Text File | 1990-10-20 | 103.9 KB | 2,994 lines |
- 1 Randy Hyde's Standard Library for 8086 Assembly Language Programmers
-
- This software is ...
-
- ShareWare
-
-
-
- 'cuz I'm sharing it with you!
-
-
- I do not want any registrations or fees for the use of this software. I
- thank God and Jesus Christ (my personal saviour) for giving me the ability to
- write such software. God wants all of us to use our talents to glorify him,
- therefore I offer this software as such.
-
- Now for the catch... It is more blessed to give than to receive. If this
- software saves you time and effort and you enjoy using it, my life will be
- enriched knowing that others have appreciated my work. I would like to share
- this wonderful feeling with you. If you like this software and use it, I would
- like you to contribute at least one routine to the library. Perhaps you think
- this library has some neat-o routines in it. Imagine how nice it would become
- if everyone used their imagination to contribute something useful to it.
-
- I hereby release this software to the public domain. You can use it in
- any way you see fit. However, I would appreciate it if you share this software
- with others much as I've shared it with you. I'm not suggesting that you give
- away software you've written with this package (I'm not quite as crazy as
- Richard Stallman, bless his heart), but if someone else would like a copy of
- this library, please help them out. Naturally, I'd be tickled pink to receive
- credit in software that uses these routines (which is the honorable thing to
- do) but I understand the way many corporations operate and won't be terribly
- put off if you use it without giving due credit. Enjoy!
-
- If you have comments, bug reports, new code to contribute, etc., you can
- reach me at:
-
- rhyde (On BIX).
- rhyde@cs.ucr.edu (On Internet).
- rhyde@ucrmath.ucr.edu (On Internet, this one may go away).
-
- or
-
- Randy Hyde
- Dept of Computer Science
- 2208 Sproul Hall
- University of California
- Riverside, Ca. 92521-0135
- or
- Randy Hyde
- c/o Braintec Corporation
- 10 Corporate Park Way, ste 110
- Irvine, Ca. 92714
-
-
- 1.1 Comments about the code
-
- This code has received very little testing. C'mon, whadda expect for
- free? I've been cranking this stuff out as fast as possible without going back
- and reworking anything I've done. The only exception has been modification of
- the routines to use the es:di/dx:si register pairs rather than es:si/ds:di
- register pairs. I expect those modifications introduced more bugs. Please
- don't expect super optimal code here. I haven't had any time to study and improve
- this code. Most of it is fairly mediocre (from a size/speed point of view).
- Hopefully, you'll agree, it's the idea that counts. If you don't like
- something I've done, you've got the sources -- have at it. (Of course, I'd
- appreciate it if you would send me any modifications.)
-
-
-
- 1.2 Wish List
-
- Next, I'll be working on FILE I/O versions of the I/O routines in this
- package. Sooner or later I'll get around to adding floating point routines to
- this package. If you're interested in adding some routines to this package,
- GREAT!
-
- Routines I'd like to have but am too busy to work on now:
-
- 1) Routines which manipulate directories (read/write/etc.)
-
- 2) A regular expression interpreter.
-
- 3) Length-prefixed strings package.
-
- 4) A windowing package.
-
- 5) A graphics package.
-
- 6) An object-oriented programming class library.
-
- 7) Just about anything else appearing in a HLL "standard" library.
-
- If you've got any ideas, I'd love to discuss them with you. Best bet is
- to reach me electronically at the E-MAIL addresses above.
-
-
- 1.3 Missing Routines to Supply RSN
-
- String package:
-
- strins Inserts one string into the middle of another
-
- strdel Deletes a sequence of characters from the middle of a string.
-
- Character Set Package:
-
- span- Skips through a sequence of characters in a string which belong to
- a character set.
-
- break- Skips through a sequence of characters in a string which do not
- belong to a character set.
-
- Memory Manager Package
-
- Memavail- Largest block of free memory available on the heap.
-
- Memfree- Total amount of free space on the heap.
-
-
-
- 2 Character Output Routines
-
- 2.1 Putc
-
- * Outputs character in AL register to the standard output device.
-
- * Output is redirectable to user-written routine.
-
- Inputs: AL- character to print.
-
- Outputs: None.
-
- Include: stdlib.a
-
- Putc is the primitive character output routine. Most other output routines in
- the standard library output data through this procedure. It prints the
- ASCII character in AL. Processing of control codes is undefined,
- although most output routines that putc is pointed at should be able to
- handle return, line feed, back space, and tab. By default, this
- routine calls DOS to print the character to the standard output
- device.
-
- Example:
-
- mov al, 'C'
- putc ;Prints "C" to std output.
-
-
- 2.2 PutCR
-
- * Easy way of printing a newline to the stdlib standard output.
-
- Inputs: None.
-
- Outputs: None.
-
- Include: stdlib.a
-
- Prints a newline (carriage return/line feed) to the current standard
- output device.
-
- Example:
-
- PutCR
-
-
- 2.3 PutcStdOut
-
- * Outputs character in AL to the DOS standard output device.
-
- * Sends a character directly to the DOS std output device.
-
- * Output is redirectable via DOS I/O redirection.
-
- * Bypasses redirection through the standard library Putc routine.
-
- Inputs: AL- character to output.
-
- Outputs: None.
-
- Include: stdlib.a
-
- PutcStdOut calls DOS to print the character in AL to the standard output
- device. Although processing of non-ASCII characters and control characters is
- undefined, most output devices handle these characters properly. In
- particular, most output devices properly handle return, line feed, back space,
- and tab.
-
- Example:
-
- mov al, 'C'
- PutcStdOut ;Writes "C" to std output.
-
-
- 2.4 PutcBIOS
-
- * Prints character in AL to the display device by calling BIOS.
-
- * Cannot be redirected by stdlib or by DOS.
-
- * Uses INT 10H/AH=14 (0Eh) for teletype-like output.
-
- * Handles return, line feed, back space, and tab. Prints other
- control characters using the IBM Character set.
-
- Inputs: AL- Character to print.
-
- Outputs: None.
-
- Include: stdlib.a
-
- PutcBIOS prints the character in AL using the BIOS routines. Output
- through this routine cannot be redirected; such output is always sent to the
- video display on the PC (unless, of course, someone has patched INT 10h).
-
- Example:
-
- mov al, "C"
- PutcBIOS
-
-
- 2.5 GetOutAdrs
-
- * Retrieves address of the current output routine.
-
- Inputs: None.
-
- Outputs: es:di - address of current output routine (called by Putc).
-
- Include: stdlib.a
-
- You can use this function to get the address of the current output
- routine, perhaps so you can save it or see if it is currently pointing at some
- particular piece of code. If you want to temporarily redirect the output and
- then restore the original output routine, consider using PushOutAdrs/PopOutAdrs
- described later.
-
- Example:
-
- GetOutAdrs
- mov word ptr SaveOutAdrs, di
- mov word ptr SaveOutAdrs+2, es
-
-
- 2.6 SetOutAdrs
-
- * Lets you set the address of the current output routine.
-
- Inputs: es:di- Address of new output routine.
-
- Outputs: None.
-
- Include: stdlib.a
-
- This routine redirects the stdlib standard output so that it calls the
- routine whose address you pass in es:di. This routine should expect the
- character in AL and must preserve all registers. At a bare minimum, it should
- handle the printable ASCII characters and the four control characters return,
- line feed, back space, and tab (unless, of course, the main purpose of this
- routine is to handle these codes in a different fashion).
-
- Example:
-
- mov ax, seg NewOutputRoutine ;The 8086 can't load ES with an
- mov es, ax ;immediate, so go through AX.
- mov di, offset NewOutputRoutine
- SetOutAdrs
- .
- .
- .
- les di, RoutinePtr
- SetOutAdrs
-
-
- 2.7 PushOutAdrs
-
- * Lets you redirect the standard output device and preserve the
- previous address.
-
- * Saves up to 16 old output routine addresses on an internal stack.
-
- * Restoration is possible using PopOutAdrs.
-
- Inputs: es:di- Address of new output routine.
-
- Outputs: Carry=0 if operation successful.
- Carry=1 if there were already 16 items on the stack.
-
- Include: stdlib.a
-
- This routine "pushes" the current output address onto an internal stack
- and then stores the value in es:di into the current output routine pointer.
- The PushOutAdrs and PopOutAdrs routines let you easily save and redirect the
- standard output and then restore the original output routine address later on.
-
- If you attempt to push more than 16 items on the stack, PushOutAdrs will
- ignore your request and return with the carry flag set. If PushOutAdrs is
- successful, it will return with the carry flag clear.
-
- Example:
-
- mov ax, seg NewOutputRoutine ;The 8086 can't load ES with an
- mov es, ax ;immediate, so go through AX.
- mov di, offset NewOutputRoutine
- PushOutAdrs
- .
- .
- .
- les di, RoutinePtr
- PushOutAdrs
-
-
- 2.8 PopOutAdrs
-
- * Restores output routine addresses saved by PushOutAdrs.
-
- * Defaults to PutcStdOut if you attempt to pop too many items off the
- stack.
-
- Inputs: None.
-
- Outputs: es:di- Points at the previous stdout routine before the pop.
-
- Include: stdlib.a
-
- PopOutAdrs undoes the effects of PushOutAdrs. It pops an item off the
- internal stack and stores it into the output routine pointer. The previous
- value in the output pointer is returned in es:di.
-
- Example:
-
- mov ax, seg NewOutputRoutine ;The 8086 can't load ES with an
- mov es, ax ;immediate, so go through AX.
- mov di, offset NewOutputRoutine
- PushOutAdrs
- .
- .
- .
- PopOutAdrs
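-
- The push/pop behavior described above can be modeled in a few lines of C.
- This is only a sketch of the documented behavior, not the library's code:
- the names (push_out_adrs, pop_out_adrs, model_putc, custom_out) are made up
- for illustration, function pointers stand in for es:di, and the return
- value of push_out_adrs stands in for the carry flag.

```c
#include <assert.h>
#include <stddef.h>

#define OUT_STACK_DEPTH 16              /* the text says 16 saved addresses */

typedef void (*out_routine)(char c);    /* stands in for a far code pointer */

/* Call counters, so the effect of redirection can be observed. */
int std_out_calls = 0;
int custom_out_calls = 0;

/* Stand-in for PutcStdOut, the default output routine. */
void putc_std_out(char c) { (void)c; std_out_calls++; }

/* Stand-in for a user-written output routine (hypothetical). */
void custom_out(char c) { (void)c; custom_out_calls++; }

static out_routine current_out = putc_std_out;
static out_routine out_stack[OUT_STACK_DEPTH];
static int out_depth = 0;

/* Model of PushOutAdrs: returns 0 (carry clear) on success, or
   1 (carry set) if 16 addresses are already saved. */
int push_out_adrs(out_routine new_routine)
{
    assert(new_routine != NULL);
    if (out_depth == OUT_STACK_DEPTH)
        return 1;                       /* request is ignored */
    out_stack[out_depth++] = current_out;
    current_out = new_routine;
    return 0;
}

/* Model of PopOutAdrs: restores the most recently saved routine and
   returns the routine that was active before the pop. With nothing
   left to pop, the output pointer falls back to putc_std_out. */
out_routine pop_out_adrs(void)
{
    out_routine previous = current_out;
    current_out = (out_depth > 0) ? out_stack[--out_depth] : putc_std_out;
    return previous;
}

/* Model of Putc: pass the character to whatever routine is current. */
void model_putc(char c) { current_out(c); }
```

- In this model, calling push_out_adrs(custom_out) sends every later
- model_putc call to custom_out until the matching pop_out_adrs.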
-
-
- 2.9 Puts
-
- * Outputs a string of characters to the stdlib standard output device.
-
- * Calls putc for each character in the string thereby sending each
- character out to the standard output device.
-
- Inputs: es:di- Contains the address of the string to print.
-
- Outputs: None.
-
- Include: stdlib.a
-
- Puts prints a zero-terminated string whose address appears in es:di. Each
- character appearing in the string is printed verbatim. There are no special
- escape characters. Unlike the "C" routine by the same name, puts does not
- print a newline after printing the string. Use putcr if you want to print the
- newline after printing a string with puts.
-
- Example:
-
- les di, StrToPrt
- puts
- putcr
-
-
- 2.10 Puth
-
- * Outputs the byte in AL as two hex digits (including leading zero if
- necessary).
-
- * Calls stdlib putc routine to print both characters to the stdlib
- standard output device.
-
- Inputs: AL- Value to print.
-
- Outputs: None.
-
- Include: Stdlib.a
-
- Prints the value in the AL register as two hexadecimal digits. If the
- value in AL is between 0 and 0Fh, puth will print a leading zero. This routine
- calls the stdlib standard output routine (putc) to print all characters.
-
- Example:
-
- mov al, 1fh
- puth
-
-
- 2.11 Putw
-
- * Outputs the word in AX as four hex digits (including leading zeros
- if necessary).
-
- * Calls stdlib putc routine to print characters to the stdlib standard
- output device.
-
- Inputs: AX- Value to print.
-
- Outputs: None.
-
- Include: Stdlib.a
-
- Prints the value in the AX register as four hexadecimal digits. If the
- value in AX is less than 1000h, putw prints leading zeros so that the output
- is always four digits wide. This routine calls the stdlib standard output
- routine (putc) to print all characters.
-
- Example:
-
- mov ax, 0f1fh
- putw
-
-
- 2.12 Puti
-
- * Outputs the word in AX as a signed decimal number (including minus
- sign, if necessary).
-
- * Calls stdlib putc routine to print characters to the stdlib standard
- output device.
-
- Inputs: AX- Value to print.
-
- Outputs: None.
-
- Include: Stdlib.a
-
- Prints the value in the AX register as a decimal integer. This routine
- uses the exact number of screen positions required to print the number
- (including a position for the minus sign, if the number is negative). This
- routine calls the stdlib standard output routine (putc) to print all
- characters.
-
- Example:
-
- mov ax, -1234
- puti
-
-
- 2.13 Putu
-
- * Outputs the word in AX as an unsigned decimal number.
-
- * Calls stdlib putc routine to print characters to the stdlib
- standard output device.
-
- Inputs: AX- Value to print.
-
- Outputs: None.
-
- Include: Stdlib.a
-
- Prints the value in the AX register as a decimal integer. This routine
- uses the exact number of screen positions required to print the number. This
- routine calls the stdlib standard output routine (putc) to print all
- characters.
-
- Example:
-
- mov ax, 1234
- putu
-
-
- 2.14 Putl
-
- * Outputs the double word in DX:AX as a signed decimal number
- (including minus sign, if necessary).
-
- * Calls stdlib putc routine to print characters to the stdlib standard
- output device.
-
- Inputs: DX:AX- Value to print.
-
- Outputs: None.
-
- Include: Stdlib.a
-
- Prints the value in the DX:AX registers as a decimal integer. This
- routine uses the exact number of screen positions required to print the number
- (including a position for the minus sign, if the number is negative). This
- routine calls the stdlib standard output routine (putc) to print all
- characters.
-
- Example:
-
- mov dx, 0ffffh
- mov ax, -1234
- putl
-
-
- 2.15 Putul
-
- * Outputs the double word in DX:AX as an unsigned decimal number.
-
- * Calls stdlib putc routine to print characters to the stdlib standard
- output device.
-
- Inputs: DX:AX- Value to print.
-
- Outputs: None.
-
- Include: Stdlib.a
-
- Prints the value in the DX:AX registers as a decimal integer. This
- routine uses the exact number of screen positions required to print the
- number. This routine calls the stdlib standard output routine (putc) to print
- all characters.
-
- Example:
-
- mov dx, 12h
- mov ax, 1234
- putul
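-
- The putl and putul examples both depend on how DX:AX forms one 32-bit
- value (DX is the high word, AX the low word). The C sketch below works
- through that arithmetic; the helper names are hypothetical and nothing
- here is taken from the library source.

```c
#include <assert.h>
#include <stdint.h>

/* The 16-bit pattern the assembler stores for an immediate operand,
   e.g. -1234 becomes 0FB2Eh. */
uint16_t word_of(long value)
{
    assert(value >= -32768L && value <= 65535L);
    return (uint16_t)value;             /* reduces modulo 65536 */
}

/* DX:AX read as an unsigned 32-bit value (what putul prints). */
uint32_t dx_ax_unsigned(uint16_t dx, uint16_t ax)
{
    return ((uint32_t)dx << 16) | ax;
}

/* DX:AX read as a signed 32-bit value (what putl prints). */
int32_t dx_ax_signed(uint16_t dx, uint16_t ax)
{
    uint32_t u = dx_ax_unsigned(dx, ax);
    if (u >= 0x80000000u)               /* sign bit set: negative value */
        return (int32_t)(u - 0x80000000u) + INT32_MIN;
    return (int32_t)u;
}
```

- With dx = 0FFFFh and ax = -1234 the signed value is -1234, which is why
- the putl example loads 0FFFFh into DX. With dx = 0 the same AX would read
- as 64302. The putul example (dx = 12h, ax = 1234) works out to
- 18 * 65536 + 1234 = 1180882.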
-
-
- 2.16 PutISize
-
- * Prints the value in AX as a signed decimal integer.
-
- * Prints the number in a minimum field width specified by the value in
- CX.
-
- Inputs: AX- Value to print.
- CX- Minimum number of print positions to use.
-
- Outputs: None.
-
- Include: Stdlib.a
-
- PutISize prints the signed integer value in AX to the stdlib standard
- output device using a minimum of n print positions. CX contains n, the minimum
- field width for the output value. The number (including any necessary minus
- sign) is printed right justified in the output field.
-
- If the number in AX requires more print positions than specified by CX,
- PutISize uses however many print positions are necessary to actually print the
- number. If you specify zero in CX, PutISize uses the minimum number of print
- positions required. Of course, PutI will also use the minimum number of print
- positions without disturbing the value in the CX register.
-
- Note that under no circumstances will the number in AX ever require more
- than six print positions (e.g., -32,768 needs five digits plus the minus sign).
-
- Examples:
-
- mov cx, 5
- mov ax, I
- PutISize
- .
- .
- .
- mov cx, 12
- mov ax, J
- PutISize
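-
- For readers who know C, the minimum field width behaves like the "*"
- width in C's own printf. The sketch below is an analogy only, assuming
- that equivalence; model_putisize is a hypothetical name and it writes to
- a buffer instead of calling putc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Format `ax` as a signed decimal number, right justified in at least
   `cx` print positions and padded on the left with blanks. Returns the
   number of print positions the value needs. */
int model_putisize(int16_t ax, uint16_t cx, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    /* "%*d" takes its minimum width from the argument list; a wider
       number simply uses as many positions as it requires. */
    return snprintf(out, out_size, "%*d", (int)cx, (int)ax);
}
```

- With cx = 8 the value -1234 occupies eight positions (three blanks, then
- "-1234"); with cx = 0 the value -32768 occupies six.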
-
-
- 2.17 PutUSize
-
- * Prints the value in AX as an unsigned decimal integer.
-
- * Prints the number in a minimum field width specified by the value in
- CX.
-
- Inputs: AX- Value to print.
- CX- Minimum number of print positions to use.
-
- Outputs: None.
-
- Include: Stdlib.a
-
- Like PutISize above except this guy prints unsigned values. Note that the
- maximum number of print positions required by any number (e.g., 65,535) is
- five.
-
- Example:
-
- mov cx, 8
- mov ax, U
- PutUSize
-
-
- 2.18 PutLSize
-
- * Prints the value in DX:AX as a long signed decimal integer.
-
- * Prints the number in a minimum field width specified by the value in
- CX.
-
- Inputs: DX:AX- Value to print.
- CX- Minimum number of print positions to use.
-
- Outputs: None.
-
- Include: Stdlib.a
-
- Like PutISize above, except this guy prints the long integer value in
- DX:AX. Note that there may be as many as 11 print positions (e.g.,
- -1,000,000,000).
-
- Example:
-
- mov cx, 16
- mov dx, word ptr L+2
- mov ax, word ptr L
- PutLSize
-
-
- 2.19 PutULSize
-
- * Prints the value in DX:AX as a long unsigned decimal integer.
-
- * Prints the number in a minimum field width specified by the value in
- CX.
-
- Inputs: DX:AX- Value to print.
- CX- Minimum number of print positions to use.
-
- Outputs: None.
-
- Include: Stdlib.a
-
- Just like PutLSize above except this guy prints unsigned numbers rather
- than signed long integers. The largest field width for such a value is 10
- print positions.
-
- Example:
-
- mov cx, 8
- mov dx, word ptr UL+2
- mov ax, word ptr UL
- PutULSize
-
-
- 2.20 Print
-
- * Prints a string literal.
-
- * Very convenient to use.
-
- * Calls stdlib putc routine to print characters to the stdlib standard
- output device.
-
- Inputs: CS:RET - Return address points at the string to print.
-
- Outputs: None.
-
- Include: Stdlib.a
-
- Print lets you print string literals in a convenient fashion. The string
- to print immediately follows the call to the print routine. The string must
- contain a zero terminating byte and may not contain any intervening zero
- bytes. Since the print routine returns to the address immediately following
- the zero terminating byte, forgetting this byte or attempting to print a zero
- byte in the middle of a literal string will cause print to return to an
- unexpected instruction. This usually hangs up the machine. Be very careful
- when using this routine!
-
- Example:
-
- print
- db "Print this string to the display device"
- db 13,10
- db "This appears on a new line"
- db 13,10
- db 0
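-
- The warning about the zero byte is easier to see in a model. In the C
- sketch below, an array of bytes stands in for the code stream that follows
- the call, and the function returns the offset at which execution would
- resume. The name model_print and the buffer interface are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* `code` models the bytes that follow the call to print. Characters up
   to the first zero byte are copied to `out` (truncated if `out` is too
   small). The return value is the offset of the first byte after the
   terminator, which is where print would return to. */
size_t model_print(const unsigned char *code, char *out, size_t out_size)
{
    assert(code != NULL && out != NULL && out_size > 0);
    size_t i = 0;                       /* position in the code stream */
    size_t n = 0;                       /* characters copied so far    */
    while (code[i] != 0) {
        if (n + 1 < out_size)           /* keep room for the NUL */
            out[n++] = (char)code[i];
        i++;
    }
    out[n] = '\0';
    return i + 1;
}
```

- If a zero byte sits in the middle of the intended text, the returned
- offset lands inside the rest of the string, so the CPU would start
- executing text bytes as instructions.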
-
-
- 2.21 Printf
-
- * Formatted output routine.
-
- * Very similar to the "C" function of the same name.
-
- * Prints integers (normal, long, unsigned, etc.), characters, strings,
- and other data types (this routine, however, does not support
- floating point output).
-
- * Calls stdlib putc routine to print characters to the stdlib standard
- output device.
-
- Inputs: CS:RET - Return address points at the format string.
-
- Outputs: None.
-
- Include: Stdlib.a
-
- Printf, like its "C" namesake, provides formatted output capabilities for
- the stdlib package. A typical call to printf always takes the following form:
-
- printf
- db "format string",0
- dd operand1, operand2, ..., operandn
-
- The format string is comparable to the one provided in the "C" programming
- language. For most characters, printf simply prints the characters in the
- format string up to the terminating zero byte. The two exceptions are
- characters prefixed by a backslash ("\") and characters prefixed by a percent
- sign ("%"). Like C's printf, stdlib's printf uses the backslash as an escape
- character and the percent sign as a lead-in to a format item.
-
- Printf uses the escape character ("\") to print special characters in a
- fashion similar to, but not identical to C's printf. Stdlib's printf routine
- supports the following special characters:
-
- * \r Print a carriage return (but no line feed).
-
- * \n Print a new line (carriage return/line feed).
-
- * \b Print a backspace character.
-
- * \t Print a tab character.
-
- * \l Print a line feed character (but no carriage return).
-
- * \f Print a form feed character.
-
- * \\ Print the backslash character.
-
- * \% Print the percent sign character.
-
- * \0xhh Print ASCII code hh, represented by two hex digits.
-
-
- C users should note a couple of differences between stdlib's escape
- sequences and C's. First, use "\%" to print a percent sign within a format
- string, not "%%". C doesn't allow the use of "\%" because the C compiler
- processes "\%" at compile time (leaving a single "%" in the object code)
- whereas printf processes the format string at run-time. It would see a single
- "%" and treat it as a format lead-in character. Stdlib's printf, on the other
- hand, processes both the "\" and "%" at run-time, therefore it can distinguish
- "\%".
-
- Strings of the form "\0xhh" must contain exactly two hex digits. The
- current printf routine isn't robust enough to handle sequences of the form
- "\0xh" which contain only a single hex digit. Keep this in mind if you find
- printf chopping off characters after you print a value.
-
- Apart from "\0x00", "\%", and "\\", there is no need to use any escape
- character sequences. Printf grabs all characters following the call to printf
- up to the terminating zero byte (which is why you need "\0x00" to print the
- null character; an embedded zero byte would end the format string). Stdlib's
- printf routine doesn't care how those characters got there. In particular, you are
- not limited to using a single string after the printf call. The following is
- perfectly legal:
-
- printf
- db "This is a string",13,10
- db "This is on a new line",13,10
- db "Print a backspace at the end of this line:"
- db 8,13,10,0
-
- Your code will run a tiny amount faster if you avoid the use of the escape
- character sequences. More importantly, the escape character sequences take at
- least two bytes. You can encode most of them as a single byte by simply
- embedding the ASCII code for that byte directly into the code stream. Don't
- forget, you cannot embed a zero byte into the code stream. A zero byte
- terminates the format string. Instead, use the "\0x00" escape sequence.
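-
- The escape rules above can be summarized as a C sketch that expands the
- documented sequences into raw bytes. This models the description, not the
- library's implementation. The name expand_escapes is hypothetical, "%" is
- treated as an ordinary character here, and returning -1 for a malformed
- "\0xhh" is an assumption of this sketch (the text only says the real
- routine is not robust in that case).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit; the caller has already checked isxdigit(). */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/* Expand escape sequences in `fmt` into `out`. Returns the number of
   bytes written, or -1 on a malformed sequence or a full buffer. The
   result is counted, not NUL terminated, because "\0x00" legitimately
   produces a zero byte. */
long expand_escapes(const char *fmt, unsigned char *out, size_t out_size)
{
    assert(fmt != NULL && out != NULL);
    size_t n = 0;
    for (size_t i = 0; fmt[i] != '\0'; i++) {
        unsigned char bytes[2];
        size_t count = 1;
        if (fmt[i] != '\\') {
            bytes[0] = (unsigned char)fmt[i];
        } else {
            char e = fmt[++i];
            if (e == '\0')
                return -1;                      /* dangling backslash */
            switch (e) {
            case 'r': bytes[0] = 13; break;
            case 'n': bytes[0] = 13; bytes[1] = 10; count = 2; break;
            case 'b': bytes[0] = 8;  break;
            case 't': bytes[0] = 9;  break;
            case 'l': bytes[0] = 10; break;
            case 'f': bytes[0] = 12; break;
            case '0':                           /* "\0xhh" */
                if (fmt[i + 1] != 'x' ||
                    !isxdigit((unsigned char)fmt[i + 2]) ||
                    !isxdigit((unsigned char)fmt[i + 3]))
                    return -1;
                bytes[0] = (unsigned char)(hex_value(fmt[i + 2]) * 16 +
                                           hex_value(fmt[i + 3]));
                i += 3;
                break;
            default:                            /* covers "\\" and "\%" */
                bytes[0] = (unsigned char)e;
                break;
            }
        }
        if (n + count > out_size)
            return -1;
        for (size_t k = 0; k < count; k++)
            out[n++] = bytes[k];
    }
    return (long)n;
}
```

- Note that "\n" costs two bytes in the format string and expands to the
- two bytes 13,10, while writing 13,10 directly costs no escape processing
- at all.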
-
- Format sequences always begin with "%". For each format sequence you
- must provide a far pointer to the associated data immediately following the
- format string, e.g.,
-
- printf
- db "%i %i",0
- dd i,j
-
- Format sequences take the general form "%s\cn^f" where:
-
- * "%" is always the "%" character. Use "\%" if you actually want
- to print a percent sign.
-
- * s is either nothing or a minus sign ("-").
-
- * "\c" is also optional, it may or may not appear in the format
- item. "c" represents any printable character.
-
- * "n" represents a string of 1 or more decimal digits.
-
- * "^" is just the caret (up-arrow) character.
-
- * "f" represents one of the format characters: i, d, x, h, u, c,
- s, ld, li, lx, or lu.
-
- The "s", "\c", "n", and "^" items are optional, the "%" and "f" items must
- be present. Furthermore, the order of these items in the format item is very
- important. The "\c" entry, for example, cannot precede the "s" entry.
- Likewise, the "^" character, if present, must follow everything except the "f"
- character(s).
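-
- The ordering rule is easiest to check against a parser. The C sketch
- below accepts a format item in exactly the order "%", "-", "\c", n, "^",
- f and rejects anything else. The struct and function names are
- hypothetical, and returning 0 for a malformed item is an assumption of
- this sketch (the text does not say what the library does with one).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    bool left_justify;   /* "-" was present                   */
    char pad;            /* blank unless "\c" was given       */
    long width;          /* 0 when no field width was given   */
    bool indirect;       /* "^" was present                   */
    bool is_long;        /* ld, li, lx, or lu                 */
    char type;           /* one of i d x h u c s              */
} format_item;

/* Parse one format item starting at s[0] == '%'. Returns the number of
   characters consumed, or 0 if the item is malformed. */
size_t parse_format_item(const char *s, format_item *item)
{
    assert(s != NULL && item != NULL);
    size_t i = 0;
    if (s[i++] != '%')
        return 0;
    item->left_justify = (s[i] == '-');
    if (item->left_justify)
        i++;
    item->pad = ' ';
    if (s[i] == '\\') {                 /* optional "\c" pad character */
        if (s[i + 1] == '\0')
            return 0;
        item->pad = s[i + 1];
        i += 2;
    }
    item->width = 0;
    while (s[i] >= '0' && s[i] <= '9') {
        item->width = item->width * 10 + (s[i++] - '0');
        if (item->width > 65535L)       /* keep the width in 16 bits */
            return 0;
    }
    item->indirect = (s[i] == '^');
    if (item->indirect)
        i++;
    item->is_long = (s[i] == 'l');
    if (item->is_long)
        i++;
    const char *allowed = item->is_long ? "dixu" : "idxhucs";
    if (s[i] == '\0' || strchr(allowed, s[i]) == NULL)
        return 0;
    item->type = s[i++];
    return i;
}
```

- Given "%-\*10d" this reports left justification, "*" padding, and a
- width of 10. Given "%\*-10d" it returns 0, because the "-" appears after
- the "\c" entry.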
-
- The format characters i, d, x, h, u, c, s, ld, li, lx, and lu control the
- output format for the data. The i and d format characters perform identical
- functions, they tell printf to print the following value as a 16-bit signed
- decimal integer. The x and h format characters instruct printf to print the
- specified value as a 16-bit or 8-bit hexadecimal value (respectively). If you
- specify u, printf prints the value as a 16-bit unsigned decimal integer. Using
- c tells printf to print the value as a single character. S tells printf that
- you're supplying the address of a zero-terminated character string, printf
- prints that string. The ld, li, lx, and lu entries are long (32-bit) versions
- of d/i, x, and u. The corresponding address points at a 32-bit value which
- printf will format and print to the standard output. The following example
- demonstrates these format items:
-
- printf
- db "I= %i, U= %u, HexC= %h, HexI= %x, C= %c, "
- db "S= %s",13,10
- db "L= %ld",13,10,0
- dd i,u,c,i,c,s,l
-
- The number of far addresses (specified by operands to the "dd"
- pseudo-opcode) must match the number of "%" format items in the format string.
- Printf counts the number of "%" format items in the format string and skips
- over this many far addresses following the format string. If the number of
- items do not match, the return address for printf will be incorrect and the
- program will probably hang or otherwise malfunction. Likewise (as for the
- print routine), the format string must end with a zero byte. The addresses of
- the items following the format string must point directly at the memory
- locations where the specified data lies.
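-
- Since each operand is a four-byte far pointer, the number of bytes
- printf skips after the format string is four times the number of format
- items. The C sketch below counts them the way the text describes; the
- function names are hypothetical, and skipping the character after every
- backslash (so that "\%" is not counted) is how this sketch interprets
- the escape rules.

```c
#include <assert.h>
#include <stddef.h>

/* Count the "%" format items in a format string. A character that
   follows a backslash is skipped, so "\%" is not a format item. */
size_t count_format_items(const char *fmt)
{
    assert(fmt != NULL);
    size_t count = 0;
    for (size_t i = 0; fmt[i] != '\0'; i++) {
        if (fmt[i] == '\\' && fmt[i + 1] != '\0')
            i++;                        /* skip the escaped character */
        else if (fmt[i] == '%')
            count++;
    }
    return count;
}

/* Bytes of "dd" operands that must follow the format string. */
size_t operand_bytes(const char *fmt)
{
    return 4 * count_format_items(fmt); /* one far pointer per item */
}
```

- For the seven-item example above, that is 28 bytes of operands. One
- operand too few or too many shifts the return address by four bytes.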
-
- When used in the format above, printf always prints the values using the
- minimum number of print positions for each operand. If you want to specify a
- minimum field width, you can do so using the "n" format option. A format item
- of the format "%10d" prints a decimal integer using at least ten print
- positions. Likewise, "%16s" prints a string using at least 16 print
- positions. If the value to print requires more than the specified number of
- print positions, printf will use however many are necessary. If the value to
- print requires fewer, printf will always print the specified number, padding
- the value with blanks. Printf will print the value right justified in the
- print field (regardless of the data's type). If you want to print the value
- left justified in the output field, use the "-" format character as a prefix to
- the field width, e.g.,
-
- printf
- db "%-17s",0
- dd string
-
- In this example, printf prints the string using a 17 character long field with
- the string left justified in the output field.
-
- By default, printf blank fills the output field if the value to print
- requires fewer print positions than specified by the format item. The "\c"
- format item allows you to change the padding character. For example, to print
- a value, right justified, using "*" as the padding character you would use the
- format item "%\*10d". To print it left justified you would use the format item
- "%-\*10d". Note that the "-" must precede the "\*". This is a limitation of
- the current version of the software. The operands must appear in this order.
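-
- The padding rules can be sketched in C as a single helper that places
- already-formatted text in a field. The name pad_field is hypothetical,
- and treating the formatted value (minus sign included) as one unit is an
- assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Place `text` in a field of at least `width` positions, filling the
   unused positions with `pad`. When `left` is true the text is left
   justified (the "-" option); otherwise it is right justified. Returns
   the number of positions used. */
size_t pad_field(const char *text, size_t width, char pad, bool left,
                 char *out, size_t out_size)
{
    assert(text != NULL && out != NULL);
    size_t len = strlen(text);
    size_t total = (len > width) ? len : width; /* never truncate */
    assert(out_size > total);
    size_t fill = total - len;
    memcpy(out + (left ? 0 : fill), text, len);
    memset(out + (left ? len : 0), pad, fill);
    out[total] = '\0';
    return total;
}
```

- With a width of 10 and "*" as the pad character, "-123" comes out as
- "******-123" right justified and "-123******" left justified.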
-
- Normally, the address(es) following the printf format string must be far
- pointers to the actual data to print. On occasion, especially when allocating
- storage on the heap (using malloc), you may not know (at assembly time) the
- address of the object you want to print. You may have only a pointer to the
- data you want to print. The "^" format option tells printf that the far
- pointer following the format string is the address of a pointer to the data
- rather than the address of the data itself. This option lets you access the
- data indirectly.
-
- Examples:
-
- printf
- db "Indirect access to i: %^d",13,10,0
- dd IPtr
- ;
- printf
- db "A string allocated on the heap: %-\.32^s"
- db 13,10,0
- dd SPtr
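-
- In C terms, the "^" option adds one extra pointer dereference. The
- sketch below models how an operand is interpreted with and without it;
- fetch_int_operand is a hypothetical name, and ordinary C pointers stand
- in for far pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* `operand` models the far pointer that follows the format string.
   Without "^" it is the address of the 16-bit value itself. With "^"
   it is the address of a pointer to that value. */
int16_t fetch_int_operand(const void *operand, int indirect)
{
    assert(operand != NULL);
    if (indirect) {
        int16_t *const *pp = operand;   /* address of a pointer */
        assert(*pp != NULL);
        return **pp;                    /* follow it to the data */
    }
    return *(const int16_t *)operand;   /* address of the data  */
}
```

- So "%d" with the address of i and "%^d" with the address of a pointer to
- i print the same value.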
-
-
- Note: unlike C, stdlib's printf routine does not support floating point
- output. There are two reasons for this: first, stdlib does not (yet) have a
- floating point library associated with it; second, adding floating point
- support would increase the size of printf by a tremendous amount, even if you
- don't use its floating point capabilities. Since most assembly language
- programmers don't use floating point arithmetic, I've intentionally left out
- floating point output. As soon as I add a floating point package to stdlib I
- will include floating point output. However, I will create a new routine,
- printff which includes floating point output. This will allow those who never
- use floating point I/O to keep their programs much smaller.
-
-
- 3 Character Input Routines
-
-
-
- 3.1 Getc
-
- * Reads a character from the standard input device and returns the
- character in the AL register.
-
- * Redirectable under program control.
-
- Inputs: None.
-
- Outputs: AL- Character from input device.
- AH- Undefined. However, if AL contains zero, AH should contain a
- keyboard scan code.
-
- Include: Stdlib.a
-
- This routine reads a character from the standard input device. This call
- is synchronous, that is, it does not return until a character is available.
- Default input device is DOS standard input.
-
- Example:
-
- getc
- mov KbdChar, al
- putc
-
-
- 3.2 GetcStdIn
-
- * Reads a character from the DOS standard input device and returns the
- character in the AL register.
-
- * Redirectable from DOS command line.
-
- Inputs: None.
-
- Outputs: AL- Character from input device.
- AH- Scan code if AL=0.
-
- Include: Stdlib.a
-
- This routine reads a character from the DOS standard input device. This
- call is synchronous, that is, it does not return until a character is
- available.
-
- Example:
-
- GetcStdIn
- mov InputChr, al
- putc
-
-
- 3.3 GetcBIOS
-
- * Reads a character from the keyboard and returns the character in the
- AL register and the scan code in the AH register.
-
- Inputs: None.
-
- Outputs: AL- Character from the keyboard.
- AH- Scan code from the keyboard.
-
- Include: Stdlib.a
-
- This routine reads a character from the keyboard. This call is
- synchronous, that is, it does not return until a character is available.
-
- Example:
-
- GetcBIOS
- mov CharRead, al
- mov ScanCode, ah
- putc
-
-
- 3.4 SetInAdrs
-
- * Lets you set the address of the current input routine.
-
- Inputs: es:di- Address of new input routine.
-
- Outputs: None.
-
- Include: stdlib.a
-
- This routine redirects the stdlib standard input so that it calls the
- routine whose address you pass in es:di. This routine should obtain a
- character (from anywhere) and return the character in AL. If it makes sense to
- do so, it should also return a "scan code" in the AH register. It must
- preserve all other registers.
-
- Example:
-
- mov ax, seg NewInputRoutine ;The 8086 can't load ES with an
- mov es, ax ;immediate, so go through AX.
- mov di, offset NewInputRoutine
- SetInAdrs
- .
- .
- .
- les di, RoutinePtr
- SetInAdrs
-
-
- 3.5 GetInAdrs
-
- * Retrieves address of the current input routine.
-
- Inputs: None.
-
- Outputs: es:di - address of current input routine (called by Getc).
-
- Include: stdlib.a
-
- You can use this function to get the address of the current input routine,
- perhaps so you can save it or see if it is currently pointing at some
- particular piece of code. If you want to temporarily redirect the input and
- then restore the original input routine, consider using PushInAdrs/PopInAdrs
- described later.
-
- Example:
-
- GetInAdrs
- mov word ptr SaveInAdrs, di
- mov word ptr SaveInAdrs+2, es
-
-
- 3.6 PushInAdrs
-
- * Lets you redirect the standard input device and preserve the
- previous address.
-
- * Saves up to 16 old input routine addresses on an internal stack.
-
- * Restoration is possible using PopInAdrs.
-
- Inputs: es:di- Address of new input routine.
-
- Outputs: Carry=0 if operation successful.
- Carry=1 if there were already 16 items on the stack.
-
- Include: stdlib.a
-
- This routine "pushes" the current input address onto an internal stack and
- then stores the value in es:di into the current input routine pointer. The
- PushInAdrs and PopInAdrs routines let you easily save and redirect the standard
- input and then restore the original input routine address later on.
-
- If you attempt to push more than 16 items on the stack, PushInAdrs will
- ignore your request and return with the carry flag set. If PushInAdrs is
- successful, it will return with the carry flag clear.
-
- Example:
-
- mov ax, seg NewInputRoutine ;The 8086 can't load ES with an
- mov es, ax ;immediate, so go through AX.
- mov di, offset NewInputRoutine
- PushInAdrs
- .
- .
- .
- les di, RoutinePtr
- PushInAdrs
-
-
- 3.7 PopInAdrs
-
- * Restores input routine addresses saved by PushInAdrs.
-
- * Defaults to GetcStdIn if you attempt to pop too many items off the
- stack.
-
- Inputs: None.
-
- Outputs: es:di- Points at the previous input routine before the pop.
-
- Include: stdlib.a
-
- PopInAdrs undoes the effects of PushInAdrs. It pops an item off the
- internal stack and stores it into the input routine pointer. The previous
- value in the input pointer is returned in es:di.
-
- Example:
-
- mov di, seg NewInputRoutine
- mov es, di ;Cannot load ES with an immediate.
- mov di, offset NewInputRoutine
- PushInAdrs
- .
- .
- .
- PopInAdrs
-
-
- 3.8 Gets
-
- * Reads a line of text from the stdlib standard input device.
-
- * Automatically allocates storage for the input string on the heap.
-
- * Handles input lines up to 256 characters long.
-
- Inputs: None.
-
- Outputs: es:di - address of input line of text.
-
- Include: stdlib.a
-
- Gets reads a line of text from the stdlib standard input. It returns a
- pointer to a string containing each character read in the ES:DI registers.
- Gets calls malloc to allocate 256 bytes on the heap (plus any overhead bytes
- required by the memory manager system). If the user enters fewer than 256
- bytes, gets calls realloc to free any unnecessary bytes. Gets returns all
- characters typed by the user except for the carriage return (ENTER) key code.
- Gets always returns a zero-terminated string. The action of various keys to
- gets depends upon where input has been directed. Generally, you can count on
- gets properly handling the backspace (erase previous character), escape (erase
- entire line), and ENTER (accept line) keys. Other keys may be active as well.
- For example, by default gets calls getc which calls DOS' standard input
- routine. If you type a control-C or break key while reading from DOS' standard
- input it will abort the program. If this bothers you, you can always redirect
- stdlib's getc routine so it calls BIOS directly rather than reading data
- through DOS' keyboard input routine.
-
- Example:
-
- gets ;Read a string from the keyboard
- puts ;Print it
- putcr ;Print a new line
- free ;Deallocate storage for string.
-
-
- 3.9 Scanf
-
- * Formatted input from stdlib standard input.
-
- * Similar to C's scanf routine.
-
- * Converts ASCII to integer, unsigned, character, string, hex, and
- long values of the above.
-
- Inputs: None.
-
- Outputs: None.
-
- Include: stdlib.a
-
- Scanf provides formatted input in a fashion analogous to printf's output
- facilities. Actually, it turns out that scanf is considerably less useful than
- printf because it doesn't provide reasonable error checking facilities (neither
- does C's version of this routine). But for quick and dirty programs whose
- input can be controlled in a rigid fashion (or if you're willing to live by
- "garbage in, garbage out") scanf provides a convenient way to get input from
- the user.
-
- Like printf, the scanf routine expects you to follow the call with a
- format string and then a list of (far pointer) memory addresses. The items in
- the scanf format string take the following form:
-
- %^f
-
- where f represents d, i, x, h, u, c, s, ld, li, lx, or lu. Like printf, the
- "^" symbol tells scanf that the address following the format string is the
- address of a (far) pointer to the data rather than the address of the data
- location itself.
-
- By default, scanf automatically skips any leading whitespace before
- attempting to read a numeric value. You can instruct scanf to skip other
- characters by placing that character in the format string. For example, the
- following call instructs scanf to read three integers separated by commas
- (and/or whitespace):
-
- scanf
- db "%i,%i,%i",0
- dd i1,i2,i3
-
- Whenever scanf encounters a non-blank character in the format string, it
- will skip that character (including multiple occurrences of that character) if
- it appears next in the input stream.
-
- Scanf always calls gets to read a new line of text from stdlib's standard
- input. If scanf exhausts the format list, it ignores any remaining characters
- on the line. If scanf exhausts the input line before processing all of the
- format items, it leaves the remaining variables unchanged. Scanf always
- deallocates the storage allocated by gets.
-
- Example:
-
- scanf
- db "%i %h %^s",0
- dd i, x, sptr
-
-
- 4 Conversion Routines
-
-
-
- 4.1 ATOL/ATOL2
-
- * Converts an ASCII string of digits to long integer format.
-
- Inputs: ES:DI- Points at string to convert.
-
- Outputs: DX:AX- Long integer converted from string.
- Carry flag- Error status
- DI (ATOL2)- First character beyond string of digits.
-
- Include: stdlib.a
-
- ATOL converts the string of digits that ES:DI points at to a long integer
- (signed) value and returns this value in DX:AX. ATOL2 works in a similar
- fashion except it doesn't preserve the DI register. That is, it leaves DI
- pointing at the first character beyond the string of digits. This routine
- returns the carry flag clear if it translated the string of digits without
- error. It returns the carry flag set if overflow occurred. Note that this
- routine stops on the first non-digit. If the string does not begin with a
- digit, this routine returns zero. The only exception to the "string of digits"
- rule is that the number can have a preceding minus sign to denote a negative
- number. In particular, note that this routine does not allow leading spaces.
-
- Example:
-
- gets ;Get a string from user
- atol ;Convert to a value in DX:AX
-
-
- 4.2 ATOUL/ATOUL2
-
- Just like ATOL above, except this guy handles unsigned long integers.
-
-
- 4.3 ATOI
-
- * Converts an ASCII string of digits to integer format.
-
- Inputs: ES:DI- Points at string to convert.
-
- Outputs: AX- Integer converted from string.
- Carry flag- Error status
- DI (ATOI2)- First character beyond string of digits.
-
- Include: stdlib.a
-
- Works just like ATOL except it translates the string to a signed 16-bit
- integer rather than a 32-bit long integer.
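-
- Example (a minimal sketch; NumStr and Overflow are hypothetical names, not
- part of stdlib):
-
- les di, NumStr ;ES:DI points at the digits.
- atoi ;Convert to a value in AX.
- jc Overflow ;Carry set if overflow occurred.
- puti ;Print the integer in AX.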
-
-
- 4.4 ATOU/ATOU2
-
- * Converts an ASCII string of digits to unsigned integer format.
-
- Inputs: ES:DI- Points at string to convert.
-
- Outputs: AX- Unsigned 16-bit integer converted from string.
- Carry flag- Error status
- DI (ATOU2)- First character beyond string of digits.
-
- Include: stdlib.a
-
- Like ATOI except it handles unsigned 16-bit integers in the range 0..65535.
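-
- Example (a sketch of how the "2" variant can walk along a string; it assumes
- NumStr is a hypothetical far pointer to a string such as "123,456" and that
- Count1 and Count2 are hypothetical word variables; error checks are omitted):
-
- les di, NumStr ;ES:DI points at "123,456".
- atou2 ;AX=123, DI points at the comma.
- mov Count1, ax
- inc di ;Skip the separator.
- atou2 ;AX=456, DI points beyond the "6".
- mov Count2, ax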
-
-
- 4.5 ATOH/ATOH2
-
- * Converts an ASCII string of hex digits to a value in AX.
-
- Inputs: ES:DI- Points at string to convert.
-
- Outputs: AX- Unsigned 16-bit integer converted from hex string.
- Carry flag- Error status
- DI (ATOH2)- First character beyond string of hex digits.
-
- Include: stdlib.a
-
- This routine converts a string of hexadecimal digits into numeric form and
- returns that value in the AX register.
-
- Example:
-
- les di, Str2Convrt
- atoh ;Convert to value in AX.
- putw ;Print word in AX.
-
-
- 4.6 ATOLH/ATOLH2
-
- Like ATOH above, except it handles 32-bit values and returns the result in
- DX:AX.
-
-
- 4.7 ITOA
-
- * Converts a 16-bit signed integer value in AX to a string of
- characters.
-
- * Automatically allocates storage for string on the heap.
-
- Inputs: AX- Signed 16-bit value to convert to a string.
-
- Outputs: ES:DI- Pointer to string containing converted characters.
-
- Include: stdlib.a
-
- ITOA converts the signed integer value in AX to a string of characters
- which represent that value. It allocates storage for this string on the heap
- via a call to the malloc routine and returns a pointer to that string in
- ES:DI. The string contains the minimum number of characters required to hold
- the character representation of the value and is always between one and six
- characters long.
-
- Example:
-
- mov ax, -1234
- itoa ;Convert to string.
- puts ;Print it.
- free ;Deallocate string.
-
-
- 4.8 UTOA
-
- * Converts a 16-bit unsigned integer value in AX to a string of
- characters.
-
- * Automatically allocates storage for string on the heap.
-
- Inputs: AX- Unsigned 16-bit value to convert to a string.
-
- Outputs: ES:DI- Pointer to string containing converted characters.
-
- Include: stdlib.a
-
- Like ITOA above, except it converts the unsigned value in AX to a string
- of characters. The string returned by UTOA is always one to five characters
- long.
-
- Example:
-
- mov ax, 65000
- utoa
- puts
- free
-
-
- 4.9 HTOA
-
- * Converts an 8-bit value in AL to the two-character hexadecimal
- representation of that byte.
-
- * Automatically allocates storage for string on the heap.
-
- Inputs: AL- 8-bit value to convert to a string.
-
- Outputs: ES:DI- Pointer to string containing converted characters.
-
- Include: stdlib.a
-
- Converts a byte to a string containing the hexadecimal representation of
- that byte. Otherwise, it's just like ITOA above. This routine always outputs
- exactly two hexadecimal digits, including a leading zero (if necessary).
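-
- Example (a minimal sketch that follows the ITOA example above):
-
- mov al, 0Fh
- htoa ;Convert to a two-digit hex string.
- puts ;Print it.
- free ;Deallocate string.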
-
-
-
- 4.10 WTOA
-
- * Converts a 16-bit value in AX to hexadecimal representation.
-
- * Automatically allocates storage for string on the heap.
-
- Inputs: AX- 16-bit value to convert to a string.
-
- Outputs: ES:DI- Pointer to string containing converted characters.
-
- Include: stdlib.a
-
- Like HTOA above, except it converts the 16-bit value in AX to a string of
- four hexadecimal digits. Outputs exactly four digits including leading zeros
- if necessary.
-
-
- 4.11 LTOA
-
- * Converts a 32-bit signed integer value in DX:AX to a string of
- characters.
-
- * Automatically allocates storage for string on the heap.
-
- Inputs: DX:AX- Signed 32-bit value to convert to a string.
-
- Outputs: ES:DI- Pointer to string containing converted characters.
-
- Include: stdlib.a
-
- Like ITOA except it converts a long integer value in DX:AX to a string of
- one to eleven characters.
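-
- Example (a sketch, assuming L is a hypothetical 32-bit variable in your data
- segment):
-
- mov ax, word ptr L ;Low word.
- mov dx, word ptr L+2 ;High word.
- ltoa ;Convert to string.
- puts ;Print it.
- free ;Deallocate string.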
-
-
- 4.12 ULTOA
-
- * Converts a 32-bit unsigned integer value in DX:AX to a string of
- characters.
-
- * Automatically allocates storage for string on the heap.
-
- Inputs: DX:AX- Unsigned 32-bit value to convert to a string.
-
- Outputs: ES:DI- Pointer to string containing converted characters.
-
- Include: stdlib.a
-
- Like LTOA except this guy handles unsigned integer values.
-
-
- 4.13 SPrintf
-
- * In-memory formatting routine.
-
- * Just like C's sprintf routine.
-
- * Automatically allocates storage for the string on the heap.
-
- * Programmer selectable maximum length for the output string.
-
- Inputs: CS:RET- Pointer to format string and operands of the sprintf
- routine.
-
- Outputs: ES:DI- Pointer to string containing output data.
-
- Include: stdlib.a
-
- Works in a manner quite similar to printf except sprintf writes its output
- to a string variable rather than to the stdlib standard output. Sprintf
- returns a pointer to the string (which it allocates on the heap) in the ES:DI
- registers. SPrintf, by default, allocates 2048 characters for this string and
- then deallocates any unnecessary storage. An external variable, sp_MaxBuf,
- holds the number of bytes to allocate upon entry into sprintf. If you wish to
- allocate more or less than 2048 bytes when calling sprintf, simply change the
- value of this public variable (type is word). Sprintf calls malloc to allocate
- the storage dynamically. You should call free to return this buffer to the
- heap when you are through with it.
-
-
- Example:
-
- sprintf
- db "I=%i, U=%u, S=%s",13,10,0
- dd i,u,s
- puts
- free
-
-
- 4.14 SBPrintf
-
- * In-memory formatting routine.
-
- * Programmer-supplied output buffer for string
-
- Inputs: CS:RET- Pointer to format string and operands of the sbprintf
- routine.
- ES:DI- Pointer to buffer area to store string data.
-
- Outputs: None.
-
- Include: stdlib.a
-
- Works just like sprintf except it does not automatically allocate storage
- for the output string. Instead, you must supply the address of an output
- buffer in the ES:DI registers.
-
- Example:
-
- les di, BufferAdrs
- sbprintf
- db "I=%i, U=%u, S=%s",13,10,0
- dd i,u,s
- puts
-
-
- 4.15 SScanf
-
- * Formatted in-memory conversions.
-
- * Similar to C's sscanf routine.
-
- * Converts ASCII to integer, unsigned, character, string, hex, and
- long values of the above.
-
- Inputs: ES:DI- Points at string containing values to convert.
- CS:RET- Points at format string and variable parameter list.
-
- Outputs: None.
-
- Include: stdlib.a
-
- Sscanf provides formatted input in a fashion analogous to scanf. The
- difference is that scanf reads a line of text from the stdlib standard input
- whereas you pass the address of a sequence of characters to sscanf in es:di.
-
- Example:
-
- ;
- ; This code reads the values for i, j, and s from the characters
- ; starting at memory location Buffer.
- ;
- les di, Buffer
- sscanf
- db "%i %i %s",0
- dd i,j,s
-
-
- 4.16 ToLower
-
- * Converts uppercase characters in AL to lower case.
-
- * Macro implementation for high performance.
-
- * Leaves characters other than uppercase unchanged.
-
- Inputs: AL- Character to (possibly) convert to lower case.
-
- Outputs: AL- Converted character.
-
- Include: stdlib.a
-
- ToLower checks the character in the AL register. If it is upper case it
- converts it to lower case. If it is anything else, ToLower leaves the value in
- AL unchanged. Note: this routine is implemented as a macro rather than as a
- procedure call. This routine is so short you would spend more time actually
- calling the routine than executing the code inside. However, the code is
- definitely longer than a (far) procedure call, so if space is critical and
- you're invoking this code several times, you may want to convert it to a
- procedure call to save a little space.
-
- Example:
-
- mov al, char
- ToLower
-
-
- 4.17 ToUpper
-
- * Converts lowercase characters in AL to upper case.
-
- * Macro implementation for high performance.
-
- * Leaves characters other than lowercase unchanged.
-
- Inputs: AL- Character to (possibly) convert to upper case.
-
- Outputs: AL- Converted character.
-
- Include: stdlib.a
-
- This is just like the ToLower routine except it converts lower case to
- uppercase rather than vice versa.
-
-
- 5 Utility Routines
-
-
-
- 5.1 ISize
-
- * Computes the number of print positions required by a 16-bit signed
- integer value.
-
- Inputs: AX- 16-bit value to compute the output size for.
-
- Outputs: AX- Number of print positions required by this number (including the
- minus sign, if necessary).
-
- Include: stdlib.a
-
- ISize computes the minimum number of character positions it will take to
- print the signed decimal value in the AX register. If the number is negative,
- it will include space for the minus sign in the count.
-
- Example:
-
- mov ax, I
- ISize
- puti ;Prints positions req'd by I.
-
-
- 5.2 USize
-
- Just like ISize above, except this guy returns the number of print
- positions required by a 16-bit unsigned value.
-
-
- 5.3 LSize
-
- * Computes the number of print positions required by a 32-bit signed
- integer value.
-
- Inputs: DX:AX- 32-bit value to compute the output size for.
-
- Outputs: AX- Number of print positions required by this number (including the
- minus sign, if necessary).
-
- Include: stdlib.a
-
- LSize computes the minimum number of character positions it will take to
- print the signed decimal value in the DX:AX registers. If the number is
- negative, it will include space for the minus sign in the count.
-
- Example:
-
- mov ax, word ptr L
- mov dx, word ptr L+2
- LSize
- puti ;Prints positions req'd by L.
-
-
- 5.4 ULSize
-
- As with LSize, except ULSize treats the value in DX:AX as an unsigned long
- integer.
-
-
- 5.5 IsAlNum
-
- * Checks character in AL to see if it is alphanumeric.
-
- Inputs: AL- Character to check.
-
- Outputs: Zero flag- Set if character is alphanumeric, clear if not.
-
- Include: stdlib.a
-
- This procedure checks the character in the AL register to see if it is in
- the range A-Z, a-z, or 0-9. Upon return, you can use the JE instruction to
- check to see if the character was in this range (or, conversely, you can use
- jne to see if it is not in the range).
-
- Example:
-
- mov al, char
- IsAlNum
- je IsAlNumChar
-
-
- 5.6 IsXDigit
-
- * Checks character in AL to see if it is a hexadecimal digit.
-
- Inputs: AL- Character to check.
-
- Outputs: Zero flag- Set if character is a hex digit, clear if not.
-
- Include: stdlib.a
-
- This procedure checks the character in the AL register to see if it is in
- the range A-F, a-f, or 0-9. Upon return, you can use the JE instruction to
- check to see if the character was in this range (or, conversely, you can use
- jne to see if it is not in the range).
-
- Example:
-
- mov al, char
- IsXDigit
- je IsXDigitChar
-
-
- 5.7 IsDigit
-
- * Checks character in AL to see if it is numeric.
-
- * Macro implementation for high performance.
-
- Inputs: AL- Character to check.
-
- Outputs: Zero flag- Set if character is numeric, clear if not.
-
- Include: stdlib.a
-
- This procedure checks the character in the AL register to see if it is in
- the range 0-9. Upon return, you can use the JE instruction to check to see if
- the character was in this range (or, conversely, you can use jne to see if it
- is not in the range).
-
- Example:
-
- mov al, char
- IsDigit
- je IsDecChar
-
-
- 5.8 IsAlpha
-
- * Checks character in AL to see if it is alphabetic.
-
- * Macro implementation for high performance.
-
- Inputs: AL- Character to check.
-
- Outputs: Zero flag- Set if character is alphabetic, clear if not.
-
- Include: stdlib.a
-
- This procedure checks the character in the AL register to see if it is in
- the range A-Z, or a-z. Upon return, you can use the JE instruction to check to
- see if the character was in this range (or, conversely, you can use jne to see
- if it is not in the range).
-
- Example:
-
- mov al, char
- IsAlpha
- je IsAlChar
-
-
- 5.9 IsLower
-
- * Checks character in AL to see if it is a lower case alphabetic
- character.
-
- * Macro implementation for high performance.
-
- Inputs: AL- Character to check.
-
- Outputs: Zero flag- Set if character is lower case alphabetic, clear if not.
-
- Include: stdlib.a
-
- This procedure checks the character in the AL register to see if it is in
- the range a-z. Upon return, you can use the JE instruction to check to see if
- the character was in this range (or, conversely, you can use jne to see if it
- is not in the range).
-
- Example:
-
- mov al, char
- IsLower
- je IsLowerChar
-
-
- 5.10 IsUpper
-
- * Checks character in AL to see if it is uppercase alphabetic.
-
- * Macro implementation for high performance.
-
- Inputs: AL- Character to check.
-
- Outputs: Zero flag- Set if character is uppercase alpha, clear if not.
-
- Include: stdlib.a
-
- This procedure checks the character in the AL register to see if it is in
- the range A-Z. Upon return, you can use the JE instruction to check to see if
- the character was in this range (or, conversely, you can use jne to see if it
- is not in the range).
-
- Example:
-
- mov al, char
- IsUpper
- je IsUpperChar
-
-
- 6 Memory Management
-
- The stdlib memory management routines let you dynamically allocate storage
- on the heap. These routines are somewhat similar to those provided by the "C"
- programming language. These routines do not perform garbage collection. Doing
- so would introduce too many restrictions. Of course, feel free to add your own
- garbage collection if you like...
-
- The allocation/deallocation routines should be fairly fast. Malloc and
- free use a modified first/next fit algorithm which lets the system quickly find
- a memory block of the desired size without undue fragmentation problems
- (average case). The overhead (eight bytes) per allocated block may seem rather
- high, but that is part of the price to pay for faster malloc and free routines.
-
- The memory manager data structure has an overhead of eight bytes (meaning
- each malloc operation requires at least eight more bytes than you ask for) and
- a granularity of 16 bytes. All pointers are far pointers and I allocate each
- new item on a paragraph boundary. The current memory manager routines always
- allocate (n+8) bytes, rounding up to the next multiple of 16 if the result is
- not evenly divisible by sixteen. The first eight bytes of the structure are
- used by the memory management routines, the remaining bytes are available for
- use by the caller (malloc, et al., return a pointer to the first byte beyond
- the memory management overhead structure). Of course, you should never count
- on any of this stuff. I could rewrite the memory manager tomorrow and if you
- use the interface which follows your code will still work properly. If you
- make assumptions about the structure of the memory management record, your code
- may go up in flames on the next revision.
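-
- As a worked example of the current scheme: a request for 100 bytes becomes
- 100+8 = 108 bytes, which rounds up to 112 bytes (seven paragraphs); a request
- for 24 bytes becomes exactly 32 bytes (two paragraphs). The following sketch
- computes the same figure in AX for a request of CX bytes. It only illustrates
- the arithmetic described above; it is not the library's actual code, and it
- assumes CX is small enough that the addition does not overflow 16 bits:
-
- mov ax, cx ;AX = requested size (n).
- add ax, 8+15 ;Add the overhead, bias for rounding.
- and ax, 0FFF0h ;Truncate to a multiple of 16.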
-
-
- 6.1 MemInit
-
- * Initializes memory manager system.
-
- Inputs: DX- Number of paragraphs to reserve.
- zzzzzzseg- Segment name of last segment in your program.
- PSP- Public word variable which holds the PSP value for your
- program.
-
- Outputs: CX- Number of paragraphs actually reserved by MemInit
- Carry=0 if no error. If carry=1, AX contains DOS error code.
-
- Include: stdlib.a
-
- This routine initializes the memory manager system. You must call it
- before using any routines which call any of the memory manager procedures
- (since a good number of the stdlib routines call the memory manager, you should
- get in the habit of always calling this routine). The system will die a
- horrible death if you call a memory manager routine (like malloc) without first
- calling MemInit.
-
- This routine expects you to define (and set up) two global names:
- zzzzzzseg and PSP. "zzzzzzseg" is a dummy segment which must be the name of
- the very last segment defined in your program. MemInit uses the name of this
- segment to determine the address of the last byte in your program. If you do
- not declare this segment last, the memory manager will happily wipe out
- anything which follows zzzzzzseg. The "shell.asm" file provides you with a
- template for your programs which properly defines this segment.
-
- PSP should be a word variable which contains the program segment prefix
- value for your program. MS-DOS passes the PSP value to your program in the DS
- and ES registers. You should save this value in the PSP variable. Don't
- forget to make PSP a public symbol in your main program's source file. The
- "shell.asm" file demonstrates how to properly set up this value.
-
- The DX register contains the number of 16-byte paragraphs you want to
- reserve for the heap. If DX contains zero, MemInit will allocate all of the
- available memory to the heap. If your program is going to allow the user to
- run a copy of the command interpreter, or if your program is going to EXEC some
- other program, you should not allocate all storage to the heap. Instead, you
- should reserve some memory for those programs. By setting DX to some value
- other than zero, you can tell MemInit how much memory you want to reserve for
- the heap. All left over memory will be available for other system (or program)
- use.
-
- If the value in DX is larger than the amount of available RAM, MemInit
- will split the available memory in half and reserve half for the heap leaving
- the other half unallocated. If you want to force this situation (to leave half
- of available memory for other purposes), simply load DX with 0FFFFh before
- calling MemInit. There will never be this much memory available, so this will
- force MemInit to split the available RAM between the heap and unallocated
- storage.
-
- On return from MemInit, the CX register contains the number of paragraphs
- actually allocated. You can use this value to see if MemInit has actually
- allocated the number of paragraphs you requested. You can also use this value
- to determine how much space is available when you elect to split the free space
- between the heap and the unallocated portions.
-
- If all goes well, this routine returns the carry flag clear. If a DOS
- memory manager error occurs, this routine returns the carry flag set and the
- DOS error code in the AX register.
-
- Example:
-
- ;
- ; Don't forget to set up PSP and zzzzzzseg before calling MemInit.
- ;
- xor dx, dx ;Allocate all available RAM
- MemInit
- jc MemoryError
- ;
- ; cx contains the number of paragraphs actually allocated.
- ;
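-
- To leave half of the available memory unallocated instead (a sketch;
- HeapParas is a hypothetical word variable and MemoryError a hypothetical
- label):
-
- mov dx, 0FFFFh ;Ask for more than can ever exist.
- MemInit
- jc MemoryError
- mov HeapParas, cx ;Paragraphs actually given to the heap.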
-
-
- 6.2 Malloc
-
- * Allocates storage from the heap.
-
- * Allocates blocks up to 64K long.
-
- * Very fast combination first/next fit allocation strategy
-
- Inputs: CX- Number of bytes to reserve.
-
- Outputs: CX- Number of bytes actually reserved by malloc.
- ES:DI- Pointer to first byte of memory allocated by malloc.
- Carry=0 if no error. Carry=1 if insufficient memory
-
- Include: stdlib.a
-
- Malloc is the workhorse routine you use to allocate a block of memory.
- You give it the number of bytes you need and if it finds a block large enough,
- it will allocate the requested amount and return a pointer to that block.
-
- Most memory managers require a small amount of overhead for each block
- they allocate. Stdlib's (current) memory manager requires an overhead of eight
- bytes. Furthermore, the granularity is 16 bytes. This means that malloc
- always allocates blocks of memory in paragraph multiples. Therefore, malloc
- may actually reserve more storage than you specify, and the value
- returned in CX may be somewhat greater than the requested value. By setting
- the minimum allocation size to a paragraph, I was able to reduce the overhead
- and improve the speed of malloc by a considerable amount.
-
- Stdlib's memory management system does not do any garbage collection.
- Doing so would place too many demands on malloc's users. Therefore, it is
- quite possible for you to fragment memory with multiple calls to malloc,
- realloc, and free. You could wind up in a situation where there is enough free
- memory to satisfy your request, but there isn't a single contiguous block large
- enough for the request. Malloc treats this as an insufficient memory error and
- returns with the carry flag set.
-
- If malloc cannot allocate a block of the requested size, it returns with
- the carry flag set. In this situation, the contents of ES:DI are undefined.
- Attempting to dereference this pointer will produce erratic and, perhaps,
- disastrous results.
-
- Example:
-
- mov cx, 256
- malloc
- jnc GoodMalloc
- print
- db "Insufficient memory to continue.",cr,lf,0
- jmp Quit
- GoodMalloc: mov byte ptr es:[di], 0 ;Init string to NULL.
-
-
- 6.3 Realloc
-
- * Reallocates a block of memory on the heap.
-
- * Allocates blocks up to 64K long.
-
- * Allows you to make the new block smaller or larger than the original
- block.
-
- * Automatically copies the data from the original block to the new
- block if the new block is larger than the old block.
-
- Inputs: CX- Number of bytes to reserve.
- ES:DI- Pointer to block to reallocate.
-
- Outputs: CX- Number of bytes actually reserved by realloc.
- ES:DI- Pointer to first byte of memory allocated by realloc.
- Carry=0 if no error. Carry=1 if insufficient memory
-
- Include: stdlib.a
-
- Realloc lets you change the size of an allocated block in the heap. It
- allows you to make the block larger or smaller. If you make the block smaller,
- realloc simply frees (returns to the heap) any leftover bytes at the end of the
- block. If you make the block larger, realloc goes out and allocates a block of
- the requested size, copies the bytes from the old block to the beginning of the
- new block (leaving the bytes at the end of the new block uninitialized), and
- then frees the old block.
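-
- Example (a minimal sketch; HeapPtr is a hypothetical far pointer variable
- holding a value returned by malloc, and ReallocError a hypothetical label):
-
- les di, HeapPtr ;Block to resize.
- mov cx, 512 ;New size in bytes.
- realloc
- jc ReallocError ;Carry set if insufficient memory.
- mov word ptr HeapPtr, di ;The block may have moved, so
- mov word ptr HeapPtr+2, es ; save the new address.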
-
-
- 6.4 Free
-
- * Deallocates a block of memory on the heap.
-
- * Automatically coalesces all contiguous, unused, blocks on the heap.
-
- * Very fast algorithm.
-
- * Handles the situation where several active pointers may still point
- at the specified block.
-
- Inputs: ES:DI- Pointer to block to deallocate.
-
- Outputs: Carry=0 if no error. Carry=1 if es:di doesn't point at an allocated
- block.
-
- Include: stdlib.a
-
- Free (possibly) deallocates storage allocated on the heap by malloc or
- realloc. Free returns this storage to the heap so other code can reuse it
- later. Note, however, that free doesn't always return storage to the heap.
- The memory manager data structure keeps track of the number of pointers
- currently pointing at a block on the heap (see DupPtr, below). If you've set
- up several pointers such that they point at the same block, free will not
- deallocate the storage until you've freed all of the pointers which point at
- that block.
-
- Free usually returns an error code (carry flag = 1) if you attempt to free
- a block which is not currently allocated or if you pass it a memory address
- which was not returned by malloc (or realloc). By no means is this routine
- totally robust. If you start calling free with arbitrary pointers in es:di
- (which happen to be pointing into the heap) it is possible, under certain
- circumstances, to confuse free and it will attempt to free a block it really
- shouldn't. I could fix this problem by adding a lot of (slow) code to the free
- routine. However, this library is for assembly language programmers. People
- who are supposed to know what they are doing. Therefore, I opted to sacrifice
- a little safety for a lot of speed.
-
- Example:
-
- les di, HeapPtr
- free
-
-
- 6.5 DupPtr
-
- * Informs the memory manager that you have more than one active
- pointer pointing at a block of memory.
-
- * Prevents free from deallocating the storage for a block while there are
- still some active pointers to that block.
-
- Inputs: ES:DI- Pointer to block.
-
- Outputs: Carry=0 if no error. Carry=1 if es:di doesn't point at an allocated
- block.
-
- Include: stdlib.a
-
- DupPtr increments the pointer count for the block at the specified
- address. Malloc sets this counter to one. Free decrements it by one. If free
- decrements the value and it becomes zero, free will release the storage to the
- heap for other use. By using DupPtr you can tell the memory manager that you
- have several pointers pointing at the same block and that it shouldn't
- deallocate the storage until you free all of those pointers.
-
- Example:
-
- les di, Ptr
- DupPtr
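-
- The following sketch traces the pointer count. It assumes Ptr is a
- hypothetical variable holding a pointer just returned by malloc (a count of
- one), and it reloads es:di before each call rather than assuming the routines
- preserve it:
-
- les di, Ptr
- DupPtr ;Count is now two.
- les di, Ptr
- free ;Count drops to one, block stays allocated.
- les di, Ptr
- free ;Count drops to zero, block is released.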
-
-
- 6.6 IsInHeap
-
- * Tells you if a pointer contains the address of a byte in the heap.
-
- Inputs: ES:DI- Pointer to block.
-
- Outputs: Carry=0 if es:di points into the heap. Carry=1 if not.
-
- Include: stdlib.a
-
- This routine lets you know if es:di contains the address of a byte in the
- heap somewhere. It does not tell you if es:di contains a valid pointer
- returned by malloc (see IsPtr, below). For example, if es:di contains the
- address of some particular element of an array (not necessarily the first
- element) allocated on the heap, IsInHeap will return with the carry clear
- denoting that es:di points somewhere in the heap. Keep in mind that
- calling this routine does not validate the pointer. It could be pointing at a
- byte which is part of the memory manager data structure rather than at actual
- data (since the memory manager maintains that information within the bounds of
- the heap). This routine is mainly useful for seeing if something is allocated
- on the heap as opposed to somewhere else (like your code, data, or stack
- segment).
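-
- Example (a minimal sketch; SomePtr and NotInHeap are hypothetical names):
-
- les di, SomePtr
- IsInHeap
- jc NotInHeap ;Carry set if es:di is outside the heap.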
-
-
- 6.7 IsPtr
-
- * Tells you if a pointer contains the address of the start of a block
- in the heap.
-
- Inputs: ES:DI- Pointer to block.
-
- Outputs: Carry=0 if es:di is a valid pointer. Carry=1 if not.
-
- Include: stdlib.a
-
- IsPtr is much more specific than IsInHeap. This guy returns the carry
- flag clear if and only if es:di contains the address of a properly allocated
- (and currently allocated) block on the heap. This pointer must be a value
- returned by malloc, realloc, or DupPtr and that block must be currently
- allocated for IsPtr to return the carry flag clear.
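-
- Example (a minimal sketch; HeapPtr and BadPtr are hypothetical names, and the
- code assumes IsPtr leaves es:di unchanged):
-
- les di, HeapPtr
- IsPtr
- jc BadPtr ;Not a currently allocated block.
- free ;Valid pointer, so free can accept it.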
-
-
- 7 String Routines
-
- The stdlib string package supports "C" style zero-terminated strings.
- Most of these routines mirror their "C" counterpart. Of course, I've added a
- few additional routines which seem useful to me.
-
-
- 7.1 Strcpy, Strcpyl
-
- * Copies a zero terminated string from one buffer to another.
-
- * Does not require the use of the DS segment register.
-
- Inputs: ES:DI- Pointer to source string (Strcpy only).
- CS:RET- Pointer to source string (Strcpyl only).
- DX:SI- Pointer to destination string.
-
- Outputs: ES:DI- Points at the destination string.
-
- Include: stdlib.a
-
- Strcpy is used to copy a zero-terminated string from one location to
- another. ES:DI points at the source string, DX:SI points at the destination
- address. Strcpy copies all bytes, up to and including the zero byte, from the
- source address to the destination address. The target buffer must be large
- enough to hold the string. Strcpy performs no error checking on the size of
- the destination buffer.
-
- Strcpyl copies the zero-terminated string immediately following the call
- instruction to the destination address specified by DX:SI. Again, this routine
- expects you to ensure that the target buffer is large enough to hold the
- result.
-
- Examples:
-
- mov dx, seg target
- mov si, offset target
- Strcpyl
- db "String for Strcpyl",0
- ;
- ; Copy that string to Target2 as well, note that ES:DI already points
- ; at "Target".
- ;
- mov dx, seg Target2
- mov si, offset Target2
- Strcpy
-
-
- 7.2 StrDup, StrDupl
-
- * Duplicates a string by copying a zero-terminated string from one
- location to a newly allocated spot on the heap.
-
- * Automatically allocates sufficient storage for destination string on
- the heap.
-
- * Does not require the use of the DS segment register.
-
- Inputs: ES:DI- Pointer to source string (Strdup only).
- CS:RET- Pointer to source string (Strdupl only).
-
- Outputs: ES:DI- Points at the destination string allocated on heap.
- Carry=0 if operation successful. Carry=1 if insufficient memory for
- new string.
-
- Include: stdlib.a
-
- Strdup and strdupl duplicate strings. You pass them a pointer to the
- string (in es:di for strdup, via the return address for strdupl) and they
- allocate sufficient storage on the heap for a copy of this string. Then these
- two routines copy their source strings to the newly allocated storage and
- return a pointer to the new string in ES:DI.
-
- Examples:
-
- Strdupl
- db "String for Strdupl",0
- jc MallocError
- mov word ptr Dest1, di
- mov word ptr Dest1+2, es
- ;
- ; Create another copy of this string. Note that es:di still points at
- ; the first copy (saved in Dest1) upon entry to Strdup, but it points
- ; at the new string on exit.
- ;
- Strdup
- jc MallocError
- mov word ptr Dest2, di
- mov word ptr Dest2+2, es
-
-
- 7.3 Strlen
-
- * Computes the length of a zero terminated string.
-
- Inputs: ES:DI- Pointer to source string.
-
- Outputs: CX- Length of specified string.
-
- Include: stdlib.a
-
- Strlen computes the length of the string whose address appears in ES:DI.
- It returns the number of characters up to, but not including, the zero
- terminating byte.
-
- Example:
-
- les di, String
- strlen
- mov sl, cx
- printf
- db "Length of '%s' is %d\n",0
- dd String, sl
-
-
- 7.4 Strcat, Strcat2, Strcatl, Strcat2l
-
- * Concatenates one string to the end of another.
-
- * Strcatl and Strcat2l allow literal string operands.
-
- * Strcat2 and Strcat2l automatically allocate storage for destination
- string.
-
- Inputs: ES:DI- Pointer to first string.
- DX:SI- Pointer to second string (Strcat & Strcat2 only).
- CS:RET- Pointer to second string (Strcatl & Strcat2l only).
-
- Outputs: ES:DI- Pointer to new string (Strcat2 & StrCat2l only).
- Carry=0 No error. Carry=1 Insufficient memory (Strcat2 & StrCat2l
- only).
-
- Include: stdlib.a
-
- These routines concatenate two strings together. They differ mainly in
- the location of their source and destination operands.
-
- Strcat concatenates the string pointed at by DX:SI to the end of the
- string pointed at by ES:DI in memory (both strings must be zero-terminated).
- The buffer pointed at by ES:DI must be large enough to hold the resulting
- string. Strcat performs no bounds checking on the data.
-
- Strcat2 works just like strcat except it does not append the second string
- on to the end of the first. Instead, Strcat2 computes the length of the two
- strings and attempts to allocate this much storage on the heap. If it is
- unsuccessful, Strcat2 returns with the carry flag set. If it successfully
- allocates this storage on the heap, it copies the string pointed at by es:di to
- the heap and then concatenates the string dx:si points at to the end of this
- string on the heap and returns with the carry flag clear and es:di pointing at
- the new string on the heap.
-
- Strcatl and Strcat2l work just like Strcat and Strcat2 except you supply
- the second string as a literal constant immediately after the call rather than
- pointing dx:si at it.
-
- Examples:
-
- les di, String1
- mov dx, seg String2
- lea si, String2
- Strcat ;String1 <- String1 + String2
- ;
- les di, String1
- Strcatl
- db "Appended String",0
- ;
- les di, String1
- mov dx, seg String2
- lea si, String2
- Strcat2 ;NewString<-String1+String2
- puts
- free
- ;
- les di, String1
- Strcat2l
- db "Appended String",0
- puts
- free
-
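The Strcat2 behavior described above can be modeled in C. This is only a
sketch under stated assumptions, not the library's code: the name
strcat2_model is hypothetical, malloc stands in for the library's heap
allocator, a NULL return stands in for "carry set", and the extra byte for
the zero terminator is an assumption the text does not spell out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of Strcat2: returns a new heap string holding
   first + second, or NULL when the allocation fails (the library
   reports that case by setting the carry flag). */
char *strcat2_model(const char *first, const char *second)
{
    size_t len1 = strlen(first);
    size_t len2 = strlen(second);

    /* Assumption: one extra byte is reserved for the zero terminator. */
    char *result = malloc(len1 + len2 + 1);
    if (result == NULL)
        return NULL;

    memcpy(result, first, len1);             /* copy the first string    */
    memcpy(result + len1, second, len2 + 1); /* append second + its zero */
    return result;
}
```

Neither source string is modified, and the caller owns the result, which is
why the assembly examples above follow Strcat2 with a call to free.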
-
- 7.5 Strchr
-
- * Searches for a single character inside a string.
-
- Inputs: ES:DI- Pointer to string.
- AL- Character to search for.
-
- Outputs: CX- Position (starting at zero) where Strchr found the character.
- Carry=0 if Strchr found the character.
- Carry=1 if the character wasn't present in the string.
-
- Include: stdlib.a
-
- Strchr locates the first occurrence of a character within a string. It
- searches through the zero-terminated string pointed at by es:di for the
- character passed in AL. If it locates the character, it returns the position
- of that character in the CX register. The first character in the string
- corresponds to location zero. If the character is not in the string, Strchr
- returns the carry flag set. CX's value is undefined in that case. If Strchr
- locates the character in the string, it returns with the carry flag clear.
-
- Example:
-
- les di, String
- mov al, Char2Find
- strchr
- jc NotPresent
- mov CharPosn, cx
-
-
- 7.6 Strstr, Strstrl
-
- * Searches for a substring inside another string.
-
- Inputs: ES:DI- Pointer to string.
- DX:SI- Pointer to substring (strstr).
- CS:RET- Pointer to substring (strstrl).
-
- Outputs: CX- Position (starting at zero) where Strstr/Strstrl found the
- substring.
- Carry=0 if Strstr/Strstrl found the substring.
- Carry=1 if the substring wasn't present in the string.
-
- Include: stdlib.a
-
- Strstr searches for the position of a substring within another string.
- ES:DI points at the string to search through, DX:SI points at the substring.
- Strstr returns the index into ES:DI's string where DX:SI's string is found. If
- the string is found, Strstr returns with the carry flag clear and CX contains
- the (zero-based) index into the string. If Strstr cannot locate the substring
- within the string ES:DI points at, it returns the carry flag set.
-
- Strstrl works just like Strstr except it expects the substring to search
- for immediately after the call instruction (rather than passing this address in
- DX:SI).
-
- Examples:
-
- les di, MainString
- lea si, Substring
- mov dx, seg Substring
- strstr
- jc NoMatch
- mov i, cx
- printf
- db "Found the substring '%s' at location %i\n",0
- dd Substring, i
- jmp Done
- ;
- NoMatch: print
- db "Could not find the substring.",cr,lf,0
- Done: les di, MainString
- strstrl
- db "test",0
- jc NoMatch2
- print
- db "Found 'test' in the string",cr,lf,0
- jmp Done2
- ;
- NoMatch2: print
- db "Did not find 'test' in the string",cr,lf,0
- Done2:
-
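As a rough C model of the Strstr result, the sketch below returns the
zero-based position of the substring. The name strstr_index is hypothetical,
and the -1 return is only a stand-in for the carry flag being set; the
library itself reports the position in CX and the outcome in the carry flag.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of Strstr: returns the zero-based index of the
   first occurrence of sub within str, or -1 if sub is not present
   (the library signals "not found" with the carry flag instead). */
long strstr_index(const char *str, const char *sub)
{
    const char *hit = strstr(str, sub);   /* standard C library search */
    if (hit == NULL)
        return -1;
    return (long)(hit - str);
}
```

Position zero is a legitimate match (the substring starts the string), so a
caller has to test the carry flag (the -1 here) rather than compare the
position against zero.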
-
- 7.7 Strcmp, Strcmpl
-
- * Compares two strings.
-
- * Reflects comparison in 8086 condition code flags.
-
- Inputs: ES:DI- Pointer to first string.
- DX:SI- Pointer to second string (strcmp).
- CS:RET- Pointer to second string (strcmpl).
-
- Outputs: CX- Position (starting at zero) where the two strings differ.
- Flags- hold the result of the comparison (should use unsigned
- branches).
-
- Include: stdlib.a
-
- Strcmp and strcmpl compare two strings. Strcmp compares the string which
- es:di points at to the string which dx:si points at. Strcmpl compares the
- string which es:di points at to the string immediately following the call
- instruction in the code stream. Strcmp(l) reflects the status of this
- comparison in the flags register. Immediately upon return from strcmp(l) you
- can use the unsigned jump instructions to test the comparison between the two
- strings. Also (upon return), the CX register contains the index into the
- strings where they are different (if the two strings are equal, Strcmp(l)
- returns with CX containing the offset of the zero byte in the two strings).
-
- Examples:
-
- les di, String1
- mov dx, seg String2
- lea si, String2
- strcmp
- jae s1GEs2
- mov i, cx
- printf
- db "String1 is less than String2 and they "
- db "differ at position %i\n",0
- dd i
- ;
- les di, String3
- strcmpl
- db "Hello",0
- jbe S3BEHello
- ;
-
- 7.8 Stricmp, Stricmpl
-
- * Compares two strings ignoring differences in alphabetic case.
-
- * Reflects comparison in 8086 condition code flags.
-
- Inputs: ES:DI- Pointer to first string.
- DX:SI- Pointer to second string (stricmp).
- CS:RET- Pointer to second string (stricmpl).
-
- Outputs: CX- Position (starting at zero) where the two strings differ.
- Flags- hold the result of the comparison (should use unsigned
- branches).
-
- Include: stdlib.a
-
- Stricmp and Stricmpl work just like Strcmp and Strcmpl except that these
- two routines are case insensitive. Strcmp and Strcmpl treat "GETS" and "gets"
- as different strings. Stricmp and Stricmpl treat these two strings as equal.
-
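The comparison rules of sections 7.7 and 7.8 can be sketched in C. This is a
model under assumptions, not the library's implementation: stricmp_model is a
hypothetical name, the sign of the return value stands in for the 8086 flags,
and folding to upper case (rather than lower case) is an assumption, since the
text does not say which way the routines fold.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical model of Stricmp. Returns <0, 0, or >0 for less than,
   equal, or greater than, and stores in *pos the index where the
   strings first differ (the index of the zero byte when they are
   equal), which is the value the library returns in CX. */
int stricmp_model(const char *s1, const char *s2, size_t *pos)
{
    size_t i = 0;
    for (;;) {
        /* Compare as unsigned bytes, matching the unsigned branches.
           Assumption: case is ignored by folding both to upper case. */
        unsigned char c1 = (unsigned char)toupper((unsigned char)s1[i]);
        unsigned char c2 = (unsigned char)toupper((unsigned char)s2[i]);
        if (c1 != c2 || c1 == 0) {
            *pos = i;
            return (int)c1 - (int)c2;
        }
        i++;
    }
}
```

Dropping the two toupper calls gives the corresponding model of Strcmp, which
treats "GETS" and "gets" as different strings.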
-
- 7.9 Strupr, Strupr2
-
- * Converts all of the lower case characters in a string to upper case.
-
- * Converts the characters in place (Strupr) or creates a new string on
- the heap for the converted string (Strupr2).
-
- Inputs: ES:DI- Pointer to string.
-
- Outputs: ES:DI- Pointer to new string on heap (Strupr2 only).
- Carry=1 if memory allocation error (Strupr2 only).
-
- Include: stdlib.a
-
- Strupr and Strupr2 convert the alphabetic characters in a string to upper
- case. You pass the address of the string containing the characters you want to
- convert in ES:DI. Strupr converts the characters in place. That is, it will
- actually modify the string you pass to it. Strupr2 first calls strdup to
- duplicate the string (on the heap) and then it converts the characters in this
- duplicate string to upper case, returning the pointer to the new string in
- ES:DI.
-
- Examples:
-
- les di, Str2Cnvrt
- strupr
- les di, Str2Cnvrt
- puts
- les di, Str2Cnvrt2
- strupr2
- puts
- free
-
-
- 7.10 Strlwr, Strlwr2
-
- * Converts all of the upper case characters in a string to lower case.
-
- * Converts the characters in place (Strlwr) or creates a new string on
- the heap for the converted string (Strlwr2).
-
- Inputs: ES:DI- Pointer to string.
-
- Outputs: ES:DI- Pointer to new string on heap (Strlwr2 only).
- Carry=1 if memory allocation error (Strlwr2 only).
-
- Include: stdlib.a
-
- Strlwr and Strlwr2 convert the alphabetic characters in a string to lower
- case. You pass the address of the string containing the characters you want to
- convert in ES:DI. Strlwr converts the characters in place. That is, it will
- actually modify the string you pass to it. Strlwr2 first calls strdup to
- duplicate the string (on the heap) and then it converts the characters in this
- duplicate string to lower case, returning the pointer to the new string in
- ES:DI.
-
- Examples:
-
- les di, Str2Cnvrt
- strlwr
- les di, Str2Cnvrt
- puts
- les di, Str2Cnvrt2
- strlwr2
- puts
- free
-
-
- 7.11 Strset, Strset2
-
- * Initializes all the characters in a string to a single value.
-
- * Automatically allocates storage on the heap for the string (Strset2
- only).
-
- Inputs: ES:DI- Pointer to string (Strset only)
- AL- Character to initialize the string with.
- CX- Length of string (Strset2 only).
-
- Outputs: ES:DI- Pointer to new string on heap (Strset2 only).
- Carry=1 if memory allocation error (Strset2 only).
-
- Include: stdlib.a
-
- Strset and Strset2 initialize strings such that each element of the string
- contains the same value (passed in AL). Strset overwrites the data in an
- existing string, replacing the characters previously in the string. To use
- Strset, simply load ES:DI with the address of a string, load AL with the
- character you want to overwrite the string with, and then call Strset. Strset
- will replace each existing character (up to the zero terminating byte) of the
- string with the character in AL.
-
- Strset2 lets you create a brand-new string. You pass the initialization
- character in AL and the length of the string in CX. Strset2 allocates CX+1
- bytes on the heap and initializes the first CX bytes to the value in AL. It
- stores a zero in the last memory location.
-
- Examples:
-
- les di, Str2Cnvrt
- mov al, '*'
- Strset
- ;
- mov al, '#'
- mov cx, 32
- Strset2
- puts
- free
- ;
-
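A small C model of Strset2, following the description above (CX+1 bytes, the
first CX filled, a zero in the last). The name strset2_model is hypothetical,
malloc stands in for the library's heap allocator, and NULL stands in for the
carry flag being set.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of Strset2: allocate length+1 bytes, fill the
   first `length` bytes with `fill`, and store a zero in the last byte.
   Returns NULL on allocation failure (carry set in the library). */
char *strset2_model(char fill, size_t length)
{
    char *s = malloc(length + 1);
    if (s == NULL)
        return NULL;
    memset(s, fill, length);   /* the CX characters passed in AL */
    s[length] = '\0';          /* the zero terminating byte      */
    return s;
}
```

With fill '#' and length 32 this mirrors the assembly example above: the
result is a 32-character string that the caller must eventually free.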
- 7.12 Strspan, Strspanl
-
- * Allows you to skip over successive characters in a string.
-
- * Very compact implementation.
-
- Inputs: ES:DI- Pointer to string to scan.
- DX:SI- Pointer to character set (Strspan only).
- CS:RET- Pointer to character set (Strspanl only).
-
- Outputs: CX- First position where Strspan(l) could not find a character in
- the attendant character set. This is the position of the zero
- terminating byte of the string if all of the characters in the string
- were present in the set.
-
- Include: stdlib.a
-
- Strspan(l) scans a string counting the number of characters which are
- present in a second string (which represents a character set). While each
- successive character in the source string is present in the character set,
- Strspan(l) advances past it. ES:DI points at a zero-terminated string of
- characters to check. DX:SI (strspan) or CS:RET (strspanl) points at another
- zero-terminated string containing the set of characters to compare against.
- While the character that ES:DI points at is present (anywhere) in the character
- set string, the routine advances to the next character and bumps a counter by
- one. Upon encountering a character which is not in the character set string,
- the routine terminates and returns the number of characters (i.e., an index
- into the string) where the mismatch occurred.
-
- Although strspan (and, especially, strspanl) is very compact and
- convenient to use, it is not particularly efficient. The character set
- routines described in the next section provide a much faster alternative at the
- expense of a little more space.
-
- Examples:
-
- les di, String
- mov dx, seg CharSet
- lea si, CharSet
- strspan
- mov i, cx
- printf
- db "The first char which is not in CharSet "
- db "occurs at position %d in String.\n",0
- dd i
- ;
- les di, String
- strspanl
- db "aeiou",0
- mov j, cx
- printf
- db "The first char which is not a vowel "
- db "occurs at position %d in String.\n",0
- dd j
-
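The scan that Strspan performs can be modeled in C as follows. This is a
sketch of the described behavior, not the library's code; strspan_model is a
hypothetical name and the return value plays the role of CX. The result
matches what the standard C function strspn computes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of Strspan: count the leading characters of str
   that appear somewhere in the zero-terminated character set string. */
size_t strspan_model(const char *str, const char *set)
{
    size_t i = 0;
    /* strchr would also match the terminator, so test for it first. */
    while (str[i] != '\0' && strchr(set, str[i]) != NULL)
        i++;
    return i;   /* index of the first character not in the set */
}
```

Every character of the string costs a scan of the set string (the strchr
call), which is the inefficiency the text above mentions.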
-
- 7.13 Strcspan, Strcspanl
-
- * Allows you to skip past characters in a string which are not members
- of a particular character set.
-
- Inputs: ES:DI- Pointer to string to scan.
- DX:SI- Pointer to character set (Strcspan only).
- CS:RET- Pointer to character set (Strcspanl only).
-
- Outputs: CX- First position where Strcspan(l) found a character in the
- attendant character set. This is the position of the zero terminating
- byte of the string if none of the characters in the string were in the
- set.
-
- Include: stdlib.a
-
- These two routines work just like strspan and strspanl except they skip
- over characters which are not in the set rather than skipping over characters
- that are in the associated character set.
-
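For comparison with Strspan, here is a C model of the complementary scan.
It is a sketch under the same caveats: strcspan_model is a hypothetical name,
and returning the position as the function result (as Strspan does in CX) is
assumed from the "work just like strspan" wording. The result matches the
standard C function strcspn.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of Strcspan: count the leading characters of str
   that do NOT appear in the zero-terminated character set string. */
size_t strcspan_model(const char *str, const char *set)
{
    size_t i = 0;
    while (str[i] != '\0' && strchr(set, str[i]) == NULL)
        i++;
    return i;   /* index of the first character that is in the set */
}
```

A typical use is finding the end of a word: scanning "hello world" against
the one-character set " " stops at position 5, the first blank.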
-
- 8 Character Set Routines
-
- The character set routines let you deal with groups of characters as a set
- rather than a string. A set is an unordered collection of objects where
- membership (presence or absence) is the only important quality. I designed the
- stdlib set routines to let you quickly check to see if an ASCII character is in
- a set, to quickly add characters to a set or remove characters from a set.
- These operations are the ones most commonly used on character sets. The other
- operations (like union, intersection, difference, etc.) are useful, but are
- not used nearly as often as the former routines. Therefore, I've optimized
- the data structure for sets to handle the membership and add/delete operations
- at the slight expense of the others.
-
- Character sets are implemented via bit vectors. A "1" bit means that an
- item is present in the set and a "0" bit means that the item is absent from the
- set. The most common implementation of a character set is to use 32
- consecutive bytes, eight bits per byte, giving 256 bits (one bit for each
- character in the character set). This makes certain operations (like
- assignment, union, intersection, etc.) fast and convenient. Other operations
- (membership, add/remove items), however, run much slower. Since these are the
- more important operations, I've chosen a different data structure to represent
- sets. A faster approach is to simply use a byte value for each item in the
- set. This offers a major advantage over the 32-byte scheme: operations like
- membership are very fast (since all you've got to do is index into an array
- and test the resulting value). It has two drawbacks: first, operations like
- set assignment, union, difference, etc., require 256 operations rather than
- 32. Second, it takes eight times as much memory.
-
- The first drawback, speed, is of little consequence. You'll rarely use
- the operations so affected, so it hardly matters that they run a little
- slower. Wasting 224 bytes per set is a problem, however, especially if
- you have a lot of character sets.
-
- The approach I've used is to allocate 272 bytes. The first eight bytes
- contain bit masks, 1, 2, 4, 8, 16, 32, 64, and 128. Each mask tells you which
- bit in the following 264 bytes is associated with the corresponding set. This
- lets me pack eight sets into 272 bytes (34 bytes per character set). This
- provides almost the speed of the 256-byte set with only two bytes of overhead
- per set (compared with the 32-byte scheme).
-
- In the stdlib.a file there is a macro that lets you define a group of
- character sets: set. You use the macro as follows:
-
- set set1, set2, set3, ..., set8
-
- You must supply between one and eight labels in the operand field. These
- are the names of the sets you want to create. The set macro automatically
- attaches these labels to the appropriate mask bytes in the set. The actual bit
- patterns for the set begin eight bytes later (from each label). Therefore, the
- byte corresponding to chr(0) is staggered by one byte for each set (which
- explains the other eight bytes needed above and beyond the 256 required for the
- set).
-
- When using the set manipulation routines, you should always pass the
- address of the mask byte (i.e., the seg/offset of one of the labels above) to
- the particular set manipulation routine you're using. Passing the address of
- the structure created with the macro above will reference only the first set in
- the group.
-
- Note that you can use the set operations for fast pattern matching
- applications. The set membership operation, for example, is much faster than
- the strspan routine found in the string package. Proper use of character sets
- can produce a program which runs much faster than some of the equivalent string
- operations.
-
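The packed layout described in this introduction can be modeled in C. This is
a sketch built only from the description above (eight mask bytes, bit
patterns starting eight bytes past each label); the function names are
hypothetical and the real routines are assembly code that may differ in
detail.

```c
#include <stdint.h>
#include <string.h>

enum { SETS_PER_GROUP = 8, GROUP_BYTES = 272 };

/* Model of the 272-byte group: bytes 0..7 hold the masks 1,2,4,...,128.
   Set k is named by the address of mask byte k, and its flag for
   character c is the masked bit of the byte 8 + c past that address. */
void group_init(uint8_t group[GROUP_BYTES])
{
    memset(group, 0, GROUP_BYTES);          /* all eight sets empty */
    for (int k = 0; k < SETS_PER_GROUP; k++)
        group[k] = (uint8_t)(1u << k);      /* mask byte for set k  */
}

/* `set` points at a mask byte, as ES:DI does for the library calls. */
void set_add(uint8_t *set, uint8_t c)
{
    set[8 + c] |= set[0];
}

void set_remove(uint8_t *set, uint8_t c)
{
    set[8 + c] &= (uint8_t)~set[0];
}

int set_member(const uint8_t *set, uint8_t c)
{
    return (set[8 + c] & set[0]) != 0;
}
```

In this model the fourth set of a group is group + 3, just as the assembly
examples add 3 to DI. A membership test is one indexed load and one AND, and
the largest offset touched is 7 + 8 + 255 = 270, which fits in 272 bytes.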
-
- 8.1 Createsets
-
- * Allocates storage for eight character sets on the heap.
-
- Inputs: None.
-
- Outputs: ES:DI- Pointer to eight sets.
- Carry=0 if no error.
- Carry=1 if insufficient memory to allocate storage for sets.
-
- Include: stdlib.a
-
- Createsets allocates 272 bytes on the heap. This is sufficient room for
- eight character sets. It then initializes the first eight bytes of this
- storage with the proper mask values for each set. Location es:0[di] gets set
- to 1, location es:1[di] gets 2, location es:2[di] gets 4, etc. The createsets
- routine also initializes all of the sets to the empty set by clearing all the
- bits to zero.
-
- Example:
-
- createsets
- jc NoMemory
- mov word ptr SetPtr, di
- mov word ptr SetPtr+2, es
- ;
-
-
- 8.2 EmptySet
-
- * Clears all of the bits for a particular set to zero.
-
- Inputs: ES:DI- pointer to first byte of desired set.
-
- Outputs: None.
-
- Include: stdlib.a
-
- Emptyset clears out the bits in a character set to zero (thereby setting
- it to the empty set). Upon entry, es:di must point at the first byte of the
- character set you want to clear. Note that this is not necessarily the address
- returned by createsets (that address is the first set in the group). Each of
- the first eight bytes of a character set structure marks the start of a
- different set. ES:DI must point at one of these bytes upon entry into
- emptyset.
-
- Example:
-
- les di, SetPtr
- add di, 3 ;Point at 4th set in group.
- emptyset
- ;
-
-
- 8.3 RangeSet
-
- * Adds all of the elements between two values to a set.
-
- Inputs: ES:DI- pointer to first byte of desired set.
- AL- Lower bound for range of items.
- AH- Upper bound for range (must be greater than AL).
-
- Outputs: None.
-
- Include: stdlib.a
-
- Rangeset adds a range of values to a set (via a union operation). It adds
- the characters from AL through AH; items already in the set remain in the set.
-
- Example:
-
- les di, SetPtr
- add di, 4 ;Point at 5th set in group.
- mov al, 'A' ;Add in the upper case letters
- mov ah, 'Z'
- rangeset
- ;
-
-
- 8.4 Addstr, Addstrl
-
- * Adds all of the characters from a string to a set.
-
- Inputs: ES:DI- pointer to first byte of desired set.
- DX:SI- pointer to string to add to set (Addstr only).
- CS:RET-pointer to string to add to set (Addstrl only).
-
- Outputs: None.
-
- Include: stdlib.a
-
- Addstr lets you add a group of characters to a set by specifying a string
- containing the characters you want in the set. To Addstr you pass a pointer to
- a zero-terminated string in dx:si. Addstr will add (union) each character from
- this string into the set. Addstrl lets you specify the string as a literal
- constant immediately after the call to addstrl.
-
- Example:
-
- les di, SetPtr
- add di, 1 ;Point at 2nd set in group.
- mov dx, seg CharStr ;Pointer to string containing
- lea si, CharStr ; chars to add to set.
- addstr ;Union in these characters.
- ;
- les di, SetPtr ;Point at first set in group.
- addstrl
- db "AaBbCcDdEeFf0123456789",0
- ;
-
-
- 8.5 Rmvstr, Rmvstrl
-
- * Removes all of the characters in a string from a set.
-
- Inputs: ES:DI- pointer to first byte of desired set.
- DX:SI- pointer to string to remove from set (Rmvstr only).
- CS:RET-pointer to string to remove from set (Rmvstrl only).
-
- Outputs: None.
-
- Include: stdlib.a
-
- Rmvstr is the converse operation to Addstr. It removes from a set the
- characters appearing in the associated character string. Rmvstrl works the
- same way except you pass the string of characters immediately after the call
- rather than via a pointer in DX:SI.
-
- Example:
-
- les di, SetPtr
- add di, 1 ;Point at 2nd set in group.
- mov dx, seg CharStr ;Pointer to string containing
- lea si, CharStr ; chars to remove from set.
- rmvstr ;Remove these characters.
- ;
- les di, SetPtr ;Point at first set in group.
- rmvstrl
- db "AaBbCcDdEeFf0123456789",0
- ;
-
-
- 8.6 AddChar
-
- * Adds a single character to a set.
-
- Inputs: ES:DI- pointer to first byte of desired set.
- AL- character to add to the set.
-
- Outputs: None.
-
- Include: stdlib.a
-
- AddChar lets you add a single character (passed in AL) to a set.
-
- Example:
-
- les di, SetPtr
- add di, 1 ;Point at 2nd set in group.
- mov al, Ch2Add ;Character to add to set.
- addchar
-
-
- 8.7 RmvChar
-
- * Removes a single character from a set.
-
- Inputs: ES:DI- pointer to first byte of desired set.
- AL- character to remove from the set.
-
- Outputs: None.
-
- Include: stdlib.a
-
- RmvChar lets you remove a single character (passed in AL) from a set.
-
- Example:
-
- les di, SetPtr
- add di, 1 ;Point at 2nd set in group.
- mov al, Ch2Rmv ;Character to remove from set.
- rmvchar
-
-
- 8.8 Member
-
- * Checks a character value to see if it is in the set.
-
- Inputs: ES:DI- pointer to first byte of desired set.
- AL- character to check.
-
- Outputs: Zero flag=1 if character is in the set.
- Zero flag=0 if character is not in the set.
-
- Include: stdlib.a
-
- Member lets you check for set membership, that is, it lets you see if a
- character value is present in some set. This routine is probably the
- most-often called routine in the collection of set routines.
-
- Example:
-
- les di, SetPtr
- add di, 7 ;Point at 8th set in group.
- mov al, Ch2Chk ;Character to check.
- member
- je IsInSet
- ;
-
- 8.9 CopySet
-
- * Copies one set to another.
-
- Inputs: ES:DI- pointer to first byte of destination set.
- DX:SI- pointer to first byte of source set.
-
- Outputs: None.
-
- Include: stdlib.a
-
- CopySet copies the items from one set to another. This is a straight
- assignment, not a union operation. After the operation the destination set is
- identical to the source set, both in terms of the elements present in the set
- and those absent from the set.
-
- Example:
-
- les di, SetPtr
- add di, 7 ;Point at 8th set in group.
- mov dx, seg SetPtr2 ;Point at first set in group.
- lea si, SetPtr2
- copyset
- ;
-
- 8.10 SetUnion
-
- * Unions (adds) the members of one set into another.
-
- Inputs: ES:DI- pointer to first byte of destination set.
- DX:SI- pointer to first byte of source set.
-
- Outputs: None.
-
- Include: stdlib.a
-
- The SetUnion routine computes the union of two sets. That is, it adds all
- of the items present in a source set to a destination set. This operation
- preserves items present in the destination set before the SetUnion operation.
-
- Example:
-
- les di, SetPtr
- add di, 7 ;Point at 8th set in group.
- mov dx, seg SetPtr2 ;Point at first set in group.
- lea si, SetPtr2
- setunion
- ;
-
- 8.11 SetIntersect
-
- * Computes the intersection of two sets.
-
- Inputs: ES:DI- pointer to first byte of destination set.
- DX:SI- pointer to first byte of source set.
-
- Outputs: None.
-
- Include: stdlib.a
-
- Setintersect computes the intersection of two sets, leaving the result in
- the destination set. The new set consists only of those items which previously
- appeared in both the source and destination sets.
-
- Example:
-
- les di, SetPtr
- add di, 7 ;Point at 8th set in group.
- mov dx, seg SetPtr2 ;Point at first set in group.
- lea si, SetPtr2
- setintersect
- ;
-
- 8.12 SetDifference
-
- * Computes the difference of two sets.
-
- Inputs: ES:DI- pointer to first byte of destination set.
- DX:SI- pointer to first byte of source set.
-
- Outputs: None.
-
- Include: stdlib.a
-
- SetDifference computes the result of (ES:DI) := (ES:DI) - (DX:SI). The
- destination set is left with its original items minus those items which are
- also in the source set.
-
- Example:
-
- les di, SetPtr
- add di, 7 ;Point at 8th set in group.
- mov dx, seg SetPtr2 ;Point at first set in group.
- lea si, SetPtr2
- setdifference
- ;
-
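The three set-algebra routines (SetUnion, SetIntersect, SetDifference) can be
modeled in C on the packed layout from the chapter introduction. This is a
sketch under assumptions: all names are hypothetical, and the layout (a set
is named by the address of its mask byte, with the flag for character c in
the byte 8 + c past it) is taken from the description, not from the library's
source.

```c
#include <stdint.h>

/* Test and store one membership flag in the assumed packed layout. */
int set_has(const uint8_t *set, int c)
{
    return (set[8 + c] & set[0]) != 0;
}

void set_put(uint8_t *set, int c, int on)
{
    if (on)
        set[8 + c] |= set[0];
    else
        set[8 + c] &= (uint8_t)~set[0];
}

/* dest := dest UNION src */
void set_union(uint8_t *dest, const uint8_t *src)
{
    for (int c = 0; c < 256; c++)
        set_put(dest, c, set_has(dest, c) || set_has(src, c));
}

/* dest := dest INTERSECT src */
void set_intersect(uint8_t *dest, const uint8_t *src)
{
    for (int c = 0; c < 256; c++)
        set_put(dest, c, set_has(dest, c) && set_has(src, c));
}

/* dest := dest - src */
void set_difference(uint8_t *dest, const uint8_t *src)
{
    for (int c = 0; c < 256; c++)
        set_put(dest, c, set_has(dest, c) && !set_has(src, c));
}
```

Each routine visits all 256 character positions, which is the "256
operations rather than 32" cost the introduction accepts in exchange for fast
membership tests. The source set is never modified, and the two sets may
live in the same group or in different groups.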
- 8.13 NextItem
-
- * Locates the next (first) available item in a set.
-
- * Searches for items in ascending order using the ASCII collating
- sequence.
-
- Inputs: ES:DI- pointer to first byte of set.
-
- Outputs: AL- Contains first item found in set (zero if the set is empty).
-
- Include: stdlib.a
-
- NextItem searches for the next available item in a set. It returns the
- ASCII code of the character it finds in the AL register. If the set is empty,
- NextItem returns zero (since chr(0) is illegal). This call does not affect the
- set in any way. In particular, after the call the character located will still
- be present in the set.
-
- Example:
-
- les di, SetPtr
- add di, 7 ;Point at 8th set in group.
- nextitem
- mov ch2, al
- ;
-
- 8.14 RmvItem
-
- * Locates the next (first) available item in a set and then removes
- that item from the set.
-
- * Searches for items in ascending order using the ASCII collating
- sequence.
-
- Inputs: ES:DI- pointer to first byte of set.
-
- Outputs: AL- Contains first item found in set (zero if the set is empty).
-
- Include: stdlib.a
-
- RmvItem searches for the next available item in a set. It returns the
- ASCII code of the character it finds in the AL register and removes that item
- from the set. If the set is empty, RmvItem returns zero (since chr(0) is
- illegal).
-
- Example:
-
- les di, SetPtr
- add di, 7 ;Point at 8th set in group.
- rmvitem
- mov ch3, al
- ;
-
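NextItem and RmvItem can be modeled in C on the packed layout from the
chapter introduction. This is a sketch under assumptions: the names are
hypothetical, the layout is taken from the description (mask byte at the set
address, flag for character c in the byte 8 + c past it), and starting the
search at code 1 is an assumption based on zero being the "empty" result.

```c
#include <stdint.h>

/* Hypothetical model of NextItem: returns the lowest character code
   present in the set, or 0 if the set is empty. The set is unchanged. */
uint8_t next_item(const uint8_t *set)
{
    /* Assumption: chr(0) is never reported, so the scan starts at 1. */
    for (int c = 1; c < 256; c++)
        if (set[8 + c] & set[0])
            return (uint8_t)c;
    return 0;
}

/* Hypothetical model of RmvItem: the same search, but the item found
   is cleared from the set before it is returned. */
uint8_t rmv_item(uint8_t *set)
{
    uint8_t c = next_item(set);
    if (c != 0)
        set[8 + c] &= (uint8_t)~set[0];
    return c;
}
```

Calling rmv_item repeatedly until it returns zero visits the members in
ascending ASCII order and leaves the set empty, while next_item keeps
returning the same character until something changes the set.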